Data Analysis

Machine Learning

Data Visualization

Miscellaneous

Data Science General

100

This is the average of a set of numbers.

What is the mean?
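As a quick illustration, the mean can be computed in Python by dividing the sum of the values by their count. The sample numbers below are made up for this sketch.

```python
from statistics import mean

# Hypothetical sample values, chosen only for illustration.
scores = [80, 90, 100, 110, 120]

# The mean is the sum of the values divided by how many there are.
manual_mean = sum(scores) / len(scores)

# The standard library offers the same calculation.
library_mean = mean(scores)
```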

100

This type of learning trains a model using examples with answers.

What is supervised learning?

100

This type of graph uses bars of different lengths to show data.

What is a bar chart?

100

This programming language is known for its simple syntax and is popular in data science.

What is Python?

100

In data science, this is what "NaN" stands for when it represents a missing or undefined value in a dataset.

What is "Not a Number"?
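A small Python sketch of how NaN behaves, using only the standard library. The `readings` list is a hypothetical example, not data from this quiz.

```python
import math

# NaN is a special floating-point value.
missing = float("nan")

# NaN never compares equal to anything, including itself,
# so math.isnan() is the reliable way to detect it.
is_equal_to_itself = missing == missing
detected = math.isnan(missing)

# Hypothetical readings with one missing entry; skip NaN before summing.
readings = [1.5, float("nan"), 2.5]
clean_total = sum(x for x in readings if not math.isnan(x))
```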

200

Sorting data in ascending order helps you easily find this middle value.

What is the median?
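The sort-then-pick-the-middle idea can be sketched in Python as follows. The values are invented for illustration.

```python
from statistics import median

# Hypothetical, unsorted sample values.
values = [7, 1, 5, 3, 9]

# Sort in ascending order, then locate the middle position.
ordered = sorted(values)
n = len(ordered)
mid = n // 2

# Odd count: take the middle value.
# Even count: average the two middle values.
if n % 2 == 1:
    manual_median = ordered[mid]
else:
    manual_median = (ordered[mid - 1] + ordered[mid]) / 2

# The standard library gives the same result.
library_median = median(values)
```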

200

A model that predicts “yes” or “no” is performing this kind of task.

What is classification?

200

This type of graph is great for showing trends over time.

What is a line chart?

200

SQL is used to work with this kind of database.

What is a relational database?

200

This is the first step in most data science projects: asking a ___.

What is a question?

300

This chart type groups data into bins to show how often values occur.

What is a histogram?

300

This common algorithm finds a line that best fits the data.

What is linear regression?
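One common way to find the best-fit line is ordinary least squares. The sketch below assumes a single input variable and uses made-up points that lie exactly on y = 2x + 1, so the fitted slope and intercept are easy to check by hand.

```python
# Hypothetical data points lying on the line y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Least-squares slope: how x and y vary together, divided by how x varies.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx

# The best-fit line always passes through the point (mean_x, mean_y).
intercept = mean_y - slope * mean_x


def predict(x):
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept
```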

300

This type of plot uses dots to show how two variables relate.

What is a scatter plot?

300

CSV files separate values using this character.

What is a comma?
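For example, Python's built-in `csv` module splits each line on commas. The names and scores below are hypothetical.

```python
import csv
import io

# A tiny CSV document held in memory; values are separated by commas.
text = "name,score\nAda,95\nGrace,88\n"

# csv.reader yields one list of strings per line.
rows = list(csv.reader(io.StringIO(text)))

header = rows[0]
records = rows[1:]
```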

300

This general term covers any chart, graph, or picture that helps you see patterns in data.

What is a visualization?

400

This property describes how spread out data is and is often measured by the standard deviation.

What is variability (or spread)?
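A short sketch of measuring spread with the population standard deviation. The data set is invented for illustration.

```python
import math
from statistics import pstdev

# Hypothetical sample values with a mean of 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean_value = sum(data) / len(data)

# Variance is the average squared distance from the mean;
# the standard deviation is its square root.
variance = sum((x - mean_value) ** 2 for x in data) / len(data)
manual_std = math.sqrt(variance)

# pstdev() computes the population standard deviation directly.
library_std = pstdev(data)
```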

400

Splitting your data into training and testing sets helps you detect this problem, in which a model memorizes its training data but performs poorly on new data.

What is overfitting?

400

This chart type is made of slices to show proportions.

What is a pie chart?

400

This Google product lets you write and run Python in your browser.

What is Google Colab?

400

This common process divides data into separate sets to train and test a model’s performance.

What is data splitting (or train/test split)?
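A minimal sketch of a train/test split using only the standard library. The 80/20 ratio, the seed, and the stand-in data set are assumptions made for illustration.

```python
import random

# Hypothetical data set: ten numbered examples.
data = list(range(10))

# Shuffle a copy with a fixed seed so the split is reproducible.
rng = random.Random(42)
shuffled = data[:]
rng.shuffle(shuffled)

# Assumed ratio: 80% for training, 20% held out for testing.
split_at = int(len(shuffled) * 0.8)
train = shuffled[:split_at]
test = shuffled[split_at:]
```

Shuffling before slicing keeps any ordering in the original data from leaking into one of the two sets.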

500

This type of analysis looks at how two variables move together.

What is correlation?
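One common measure is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it by hand on made-up data where the second variable is exactly double the first.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data: ys rises in step with xs, zs falls as xs rises.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
zs = [10, 8, 6, 4, 2]

r_up = pearson(xs, ys)    # variables move together
r_down = pearson(xs, zs)  # variables move in opposite directions
```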

500

This tree-based method builds many decision trees and takes a vote.

What is a random forest?

500

This key element of a chart explains what the colors or shapes mean.

What is a legend?

500

This is the process of making messy data ready to analyze.

What is data cleaning (or wrangling)?

500

These are the three main components of the "data science pipeline."

What are data collection, data cleaning/preprocessing, and modeling?