Data Science
Production Environments
Data Visualization
Data Engineering
Fun
100

The creator of modern data science?

Who is C.M. Tracy?

100

This web-based hosting service is commonly used for version control of code repositories.

What is GitHub?

100

This type of chart displays the frequency distribution of numerical data by grouping values into bins.

What is a histogram?
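
For anyone who wants to see the binning step behind that answer, here is a minimal sketch in plain Python. The `histogram_counts` helper, the sample `ages`, and the bin `edges` are all made up for illustration; in practice a plotting library would do this for you.

```python
def histogram_counts(values, bin_edges):
    """Count values per bin [edge_i, edge_{i+1}); the last bin also includes its right edge."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            lo, hi = bin_edges[i], bin_edges[i + 1]
            is_last_bin = i == len(counts) - 1
            if lo <= v < hi or (is_last_bin and v == hi):
                counts[i] += 1
                break
    return counts


# Hypothetical sample data, not taken from the quiz.
ages = [22, 25, 31, 35, 38, 41, 47, 52, 58, 60]
edges = [20, 30, 40, 50, 60]
counts = histogram_counts(ages, edges)

# Print a rough text histogram, one row per bin.
for lo, hi, n in zip(edges, edges[1:], counts):
    print(f"{lo}-{hi}: {'#' * n}")
```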

100

This type of database stores data in tables and uses SQL for querying.

What is a relational database?

100

The chocolate shop around the corner

What is Madhu Chocolate?

200

This fundamental concept in machine learning refers to the trade-off between errors due to underfitting and errors due to overfitting, with the aim of finding the balance that generalizes best to unseen data.

What is the bias-variance tradeoff?

200

What is the most powerful tool a data scientist has at their disposal?

What is a cron job?

200

This type of plot displays the relationship between two quantitative variables, using points to represent values.

What is a scatter plot?

200

This process involves extracting data from various sources, transforming it into a suitable format, and loading it into a data warehouse.

What is ETL?
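
The three steps in that clue can be sketched with the Python standard library. This is only an illustration under stated assumptions: the `RAW_CSV` sample, the `orders` table, and the `extract`/`transform`/`load` helper names are hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw source data with messy whitespace and a missing amount.
RAW_CSV = """order_id,amount_usd,country
1, 19.99 ,us
2,5.00,ca
3,,us
"""


def extract(text):
    """Extract: read raw rows from the source (a CSV string here)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, cast types, normalize country codes."""
    clean = []
    for row in rows:
        amount = row["amount_usd"].strip()
        if not amount:
            continue  # skip rows with a missing amount
        clean.append((int(row["order_id"]), float(amount), row["country"].strip().upper()))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount_usd REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a warehouse
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT order_id, amount_usd, country FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```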


200

The place that Collin recommended for a data science dinner many years ago (also the best restaurant in Austin)

What is Intero? :)

300

This unsupervised learning algorithm is commonly used for clustering tasks by partitioning the dataset into a set number of clusters, with each data point assigned to the cluster with the nearest mean.

What is k-means clustering?
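
The assign-then-average loop described in that clue can be sketched in plain Python. This is a rough illustration, not a production implementation: the `kmeans` function, the sample `points`, and the hand-picked starting centroids are assumptions made for the example, and real code would normally use a library and random initialization.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)

        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(tuple(sum(dim) / len(cluster) for dim in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place

        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D data forming two obvious groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
final_centroids, clusters = kmeans(points, [(1.0, 1.0), (8.0, 8.0)])
print(final_centroids)
print(clusters)
```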

300

In software development, this process involves preparing and configuring code for release to a production environment, often including steps such as testing, optimization, and packaging.

What is deployment?

300

This technique visualizes text data by word frequency.

What is a word cloud?

300

This open-source framework is used for distributed storage and processing of large datasets using the MapReduce programming model.

What is Hadoop?

300

The best hot dog place in Chicago, which recently opened a store in DFW and where some guy spilled his chocolate milkshake on my white Nikes

What is Portillo's?

400

This algorithm, known for its simplicity and effectiveness, is used for binary classification by finding the hyperplane that separates two classes in the feature space with the largest margin.

What is an SVM (support vector machine)?


400

This practice involves creating identical copies of production environments, including servers, databases, and configurations, for testing, development, and staging purposes.

What is environment cloning or staging?

400

This chart is commonly used to show hierarchical or tree-structured data.

What is a Dendrogram?

400

This cloud-based data warehousing platform separates storage and compute, allowing for scalable and efficient data processing and querying. Sometimes they also take us to soccer games.

What is Snowflake?

400

Something that a lot of us do at the beginning of our notebooks that we should STOP doing

What is using Alex's personal Snowflake password?

500

This deep learning architecture, designed for image processing tasks, uses layers of convolutional filters to automatically learn spatial hierarchies of features from input images.

What is a convolutional neural network?


500

What should we do before deploying to prod?

Test everything :)

500

In Matplotlib, this parameter of the scatter() function is used to set the transparency level of the markers, allowing for better visualization of overlapping points in dense scatter plots.

What is the alpha parameter?

500

This tool is used for transforming data in the warehouse and allows data analysts and engineers to write data transformation workflows using SQL.

What is dbt?

500

Sule's favorite car accessory

What are glasses with eyelashes?
