This is the average of a set of numbers.
What is the mean?
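A minimal sketch of the calculation, using made-up sample numbers (the `scores` list is an assumption for illustration only):

```python
from statistics import mean

# Hypothetical sample data, not taken from the quiz.
scores = [2, 4, 6, 8]

# The mean is the sum of the values divided by how many there are.
average = sum(scores) / len(scores)
print(average)  # 5.0

# The standard library's statistics.mean gives the same result.
assert average == mean(scores)
```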
This type of learning trains a model using examples with answers.
What is supervised learning?
This type of graph uses bars of different lengths to show data.
What is a bar chart?
This programming language is known for its simple syntax and is popular in data science.
What is Python?
In data science, what does "NaN" stand for when representing missing or undefined values in a dataset?
Not a Number
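A short sketch of how NaN behaves in plain Python; the `readings` list is a hypothetical example:

```python
import math

nan = float("nan")

# NaN is never equal to anything, including itself,
# so use math.isnan to test for it.
print(nan == nan)       # False
print(math.isnan(nan))  # True

# Hypothetical readings with one missing value.
readings = [1.5, float("nan"), 3.0]
clean = [r for r in readings if not math.isnan(r)]
print(clean)  # [1.5, 3.0]
```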
Sorting data in ascending order helps you easily find this middle value.
What is the median?
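One way to sketch this: sort the values, then take the middle one (or the average of the two middle ones when the count is even). The helper name `middle_value` is hypothetical:

```python
def middle_value(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)  # ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle element exists.
        return ordered[mid]
    # Even count: average the two middle elements.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(middle_value([7, 1, 3]))     # 3
print(middle_value([4, 1, 3, 2]))  # 2.5
```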
A model that predicts “yes” or “no” is performing this kind of task.
What is classification?
This type of graph is great for showing trends over time.
What is a line chart?
SQL is used to work with this kind of database.
What is a relational database?
This is the first step in most data science projects: asking a ___.
What is a question?
This chart type groups data into bins to show how often values occur.
What is a histogram?
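The binning step can be sketched without any plotting library. The helper `bin_counts` and the `ages` data are assumptions for illustration:

```python
from collections import Counter

def bin_counts(values, bin_width):
    """Count how many values fall into each bin, keyed by the bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical sample data.
ages = [3, 7, 12, 15, 18, 21, 29]
print(bin_counts(ages, 10))  # {0: 2, 10: 3, 20: 2}
```

A histogram then draws one bar per bin, with the bar height equal to the count.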
This common algorithm finds a line that best fits the data.
What is linear regression?
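A minimal sketch of fitting a line by ordinary least squares for a single input variable. The function name `fit_line` and the sample points are hypothetical; libraries such as scikit-learn offer fuller implementations:

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)  # 2.0 1.0
```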
This type of plot uses dots to show how two variables relate.
What is a scatter plot?
CSV files separate values using this character.
What is a comma?
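A small sketch of reading comma-separated text with Python's built-in `csv` module. The sample text is invented for illustration:

```python
import csv
import io

# Hypothetical CSV content; each line is a row, commas separate the values.
text = "name,score\nAda,90\nGrace,85\n"

rows = list(csv.reader(io.StringIO(text)))
print(rows[0])  # ['name', 'score']
print(rows[1])  # ['Ada', '90']
```

Note that every value is read as a string; numbers need an explicit conversion such as `int(rows[1][1])`.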
This general term covers any chart, graph, or picture that helps you see patterns in data.
What is a visualization?
This property describes how spread out data is and is often measured by standard deviation.
What is variability (or spread)?
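A quick sketch comparing two made-up datasets that share the same mean but differ in spread. It uses `statistics.pstdev`, the population standard deviation:

```python
from statistics import mean, pstdev

# Hypothetical datasets with the same mean (10).
tight = [9, 10, 11]
wide = [0, 10, 20]

assert mean(tight) == mean(wide) == 10

# The more spread-out dataset has the larger standard deviation.
print(pstdev(tight) < pstdev(wide))  # True
```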
Splitting your data into training and testing sets helps you detect this problem, where a model fits its training data too closely to generalize.
What is overfitting?
This chart type is made of slices to show proportions.
What is a pie chart?
This Google product lets you write and run Python in your browser.
What is Google Colab?
This common process divides data into separate sets to train and test a model’s performance.
What is data splitting (or train/test split)?
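A minimal sketch of a split using only the standard library. The helper `split_rows`, the 25% test fraction, and the seed are all assumptions; in practice many people use a library function such as scikit-learn's `train_test_split`:

```python
import random

def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows, then return (train, test) lists."""
    shuffled = list(rows)
    # A fixed seed makes the shuffle repeatable.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

train, test = split_rows(range(8))
print(len(train), len(test))  # 6 2
```

Every row lands in exactly one of the two sets, so the model is tested on data it never saw during training.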
This type of analysis looks at how two variables move together.
What is correlation?
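One common measure is the Pearson correlation coefficient, sketched here by hand. The function name `pearson` and the sample data are hypothetical:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Values near +1 mean the variables rise together;
# values near -1 mean one falls as the other rises.
up = pearson([1, 2, 3], [2, 4, 6])
down = pearson([1, 2, 3], [6, 4, 2])
print(round(up, 6), round(down, 6))  # 1.0 -1.0
```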
This tree-based method builds many decision trees and takes a vote.
What is a random forest?
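The voting step alone can be sketched as a majority count. This is not a full random forest: the `tree_votes` list stands in for predictions that trained trees would produce, and the helper name is hypothetical:

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted most often."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical predictions from three decision trees for one example.
tree_votes = ["yes", "no", "yes"]
print(majority_vote(tree_votes))  # yes
```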
This key element of a chart explains what the colors or shapes mean.
What is a legend?
This is the process of making messy data ready to analyze.
What is data cleaning (or wrangling)?
What are the three main components of the "data science pipeline"?
Data collection, data cleaning/preprocessing, and modeling.