This is the average of a set of numbers, calculated by dividing the sum by the count.
Mean
This step comes before analysis and includes cleaning, transforming, and organizing data.
Data Wrangling or Preprocessing
You flip a fair coin 3 times. This is the chance you’ll see exactly two heads
What is "3/8"
Statistics is commonly used in this scientific process to test hypotheses.
What is “experimentation”?
In Python, this library is commonly used for data manipulation and is built on top of NumPy?
Pandas
This measure of spread is calculated as the square root of the variance.
Standard Deviation
When values in a dataset are unusually high or low compared to the rest, they’re called this.
Outliers
A bag has 1 green, 1 yellow, and 1 purple marble. You randomly pick one. This is the probability it’s not purple.
What is 2/3?
This prefix is applied to the word “statistics” to indicate the application of statistics in relation to living organisms, often the main application of statistics that the National Institute of Health uses.
What is “bio”?
This term refers to a function that calls itself during execution. It’s useful for tasks like traversing trees.
This theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases.
Central Limit Theorem
To test a claim about a population using a sample, you’d use this type of reasoning.
Inferential Statistics
If three people are in a room, the probability that no one shares a birthday is approximately this.
What is 364/365 × 363/365?
Statistics is not only used in STEM related fields, name this application of Statistics often used for voting.
What is “polling”?
This data structure follows a First In First Out order.
Queue
This value separates the higher half from the lower half of a data set when arranged in order.
Median
You suspect cheating in a dataset where all digits start with 9. This statistical law might be violated.
Benford's Law
You roll two dice. These are the number of outcomes that add up to 7.
What is 6?
In engineering, statistics is linked to a key principle, which is often defined as a potential for loss or harm due to uncertainty or variability, name this key principle.
What is "risk"?
In Python, this method is used to apply a function to every element in a DataFrame column.
".apply()"
In hypothesis testing, this is the probability of observing your data—or something more extreme—if the null hypothesis is true.
p-value
You notice a correlation between ice cream sales and shark attacks. A data detective would say this type of variable is likely involved.
If there were 100 doors and Monty reveals 98 goats after your choice, this is the probability of winning if you switch.
What is 99/100?
What laws state that the complement of the union of two sets is equal to the intersection of the respective complements and that the complement of the intersection of two sets is the same as the union of their individual complements?
What is "De Morgan's Laws"?
This command-line tool is used to manage Python packages, often paired with virtual environments
pip