Data Collection
Data Storage
Variables
Methodology
Data Visualization
100

Most common channel for data collection

What is internal acquisition?

100

Well-organized info that is often stored in spreadsheets

What is structured data?

100

Type of variable that includes names or symbols of objects

What is a categorical variable?

100

Statistical method that focuses on discovery and exploration

What is data mining?

100

Used to describe trends over time

What is a line chart?

200

Data stored across a network of computer servers

What is a distributed file system?

200

A form of storage used for unconventional and unstructured data

What is a data lake?

200

Another name for categorical variables

What are nominal variables?

200

Gives computers the ability to learn without being explicitly programmed

What is machine learning?

200

Similar to a bar chart but with no gaps between bars

What is a histogram?

300

Outsourcing tasks to a remote and distributed workforce

What is crowdsourcing?

300

Large amounts of information that defy conventional methods of processing

What is big data?

300

Type of variable that categorizes values in a meaningful sequence

What is an ordinal variable?

300

Machine learning that uncovers patterns between known inputs and outputs

What is supervised learning?

300

Used to show relationships between variables

What is a scatterplot?

400

Info mined from non-traditional sources

What is alternative data?

400

Defines where data can be placed inside relational databases

What is a schema?

400

Type of variable that is expressed and processed mathematically

What is a numeric variable?

400

Type of analytics that compresses info into an easily readable format

What is descriptive analytics?

400

Used for displaying the distribution of a set of continuous data

What is a box plot?

500

Collecting info from the web using code and automation

What is web scraping?

500

A single server within a network of connected servers

What is a node?

500

Binary value that produces one of two set outcomes

What is a Boolean variable?

500

Machine learning that achieves a specific output through random trial and error

What is reinforcement learning?

500

Shows correlation between variables as colors in a matrix

What is a heatmap?