Data Science Basics
Statistics
ROCKies
Random Pot Luck
Data Visualization
100

This is the first step in any data science workflow, involving collecting and organizing raw data.

What is data acquisition?

100

This measure of central tendency is the middle value in a sorted dataset.

What is median?

100

This term refers to climbing short routes close to the ground without ropes or harnesses.

What is bouldering?

100

This country hosted the 2016 Summer Olympics.

What is Brazil?

100

This chart type is best for showing proportions of a whole.

What is a pie chart?

200

This term refers to cleaning and preparing data for analysis.

What is data wrangling?

200

This type of distribution is symmetric and bell-shaped.

What is a normal distribution?

200

This is the highest peak in Colorado.

What is Mount Elbert?

200

This billionaire founded SpaceX.

Who is Elon Musk?

200

This visualization shows the distribution of a single variable using bins.

What is a histogram?

300

This programming language is widely used for data analysis and visualization.

What is Python?

300

This test compares means between two groups to see if they differ significantly.

What is a t-test?

300

This player was the first to be inducted into the Baseball Hall of Fame as a Colorado Rockie.

Who is Larry Walker?

300

This Pixar movie features a clownfish searching for his son.

What is Finding Nemo?

300

This plot is ideal for visualizing relationships between two continuous variables.

What is a scatter plot?

400

This problem occurs when a model fits its training data too closely and fails to generalize to unseen data.

What is overfitting?

400

This metric measures the strength and direction of a linear relationship between two variables.

What is the correlation coefficient?

400

This grading system is widely used in the U.S. for rating climbing routes.

What is the Yosemite Decimal System (YDS)?

400

This is the process by which plants make food using sunlight.

What is photosynthesis?

400

This dimensionality-reduction technique projects high-dimensional data into 2D or 3D for plotting.

What is PCA (Principal Component Analysis)?

500

This term describes the practice of splitting data into training, validation, and test sets to evaluate model performance.

What is data partitioning?

500

This theorem states that the distribution of sample means approaches a normal distribution as sample size increases.

What is the Central Limit Theorem?

500

This geological feature in the Rockies is known for its dramatic red sandstone formations near Colorado Springs.

What is Garden of the Gods?

500

This treaty ended World War I in 1919.

What is the Treaty of Versailles?

500

This type of plot uses hexagonal bins to handle overplotting in large datasets.

What is a hexbin plot?
