Data Collection
Data Storage
Variables
Methodology
Data Visualization
100

Most common channel for data collection

What is internal acquisition?

100

Well-organized info that is often stored in spreadsheets

What is structured data?

100

Type of variable that includes names or symbols of objects

What is a categorical variable?

100

Statistical method that focuses on discovery and exploration

What is data mining?

100

Used to describe trends over time

What is a line chart?

200

Data stored across a network of computer servers

What is a distributed file system?

200

A form of storage used for unconventional and unstructured data

What is a data lake?

200

Another name for categorical variables

What are nominal variables?

200

In data cleaning, these are one-off observations that, at a glance, do not appear to fit within the data you are analyzing.

What are (unwanted) outliers?

200

Similar to a bar chart with no margin between bars

What is a histogram?

300

Outsourcing tasks to a remote and distributed workforce

What is crowdsourcing?

300

Large amounts of information that defy conventional methods of processing

What is big data?

300

Type of variable that categorizes values in a meaningful sequence

What is an ordinal variable?

300

This type of data analytics allows easy interpretation of large volumes of data to identify new opportunities.

What is business intelligence?

300

Used to show relationships between variables

What is a scatterplot?

400

Info mined from non-traditional sources

What is alternative data?

400

Defines where data can be placed inside relational databases

What is a schema?

400

Type of variable that is expressed and processed mathematically

What is a numeric variable?

400

Type of analytics that compresses info into an easily readable format

What is descriptive analytics?

400

Used for displaying the distribution of a set of continuous data

What is a box plot?

500

Collecting info from the web using code and automation

What is web scraping?

500

A single server within a network of connected servers

What is a node?

500

Binary value that produces one of two set outcomes

What is a Boolean variable?

500

This is the study of collection, analysis, interpretation, presentation, and organization of data.

What is statistics?

500

Shows correlation between variables as colors in a matrix

What is a heatmap?
