Sampling & Study Design
Hypothesis Testing
Data & Descriptive Stats
Experimental Design & Bias
Probability and Random Variables
100

Definition: Observes individuals and measures variables without assigning treatments.
Example: Studying health habits by surveying people rather than assigning diets.

What is Observational Study

100

Definition: Rejecting a true null hypothesis (false positive).
Example: Saying a new drug works when it doesn't.

What is Type I Error

100

Definition: Minimum, Q1, Median, Q3, Maximum.
Use: Creating boxplots.

What is Five-Number Summary

100

Definition: The group in an experiment that does not receive the treatment—used for comparison.

What is Control Group 

100

Definition: A variable whose value is determined by chance.
Types: Discrete or continuous

What is Random Variable 

200

Comparison - Use a control group
Random assignment - Reduces bias
Control - Reduce confounding variables
Replication - Use enough subjects to detect real effects

What is Principles of Experimental Design 

200

Definition: The probability threshold (alpha) for rejecting the null hypothesis: reject when the p-value falls below it.
Common values: 0.05, 0.01

What is Significance Level 

200

Definition: The range between the first (Q1) and third (Q3) quartiles.
Formula: IQR = Q3 - Q1
Use: Measures spread; helps identify outliers.

What is Interquartile Range (IQR)
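
To make the five-number summary and the IQR concrete, here is a minimal sketch using Python's standard library. The data set is made up for illustration, and the 1.5 x IQR outlier fences are a common convention that the card itself does not state. Quartile conventions differ between textbooks and software, so other tools may give slightly different Q1 and Q3 for the same data.

```python
import statistics

# Made-up sample, already sorted for readability.
data = [1, 3, 4, 6, 7, 9, 10, 12, 30]

# The "inclusive" method treats the data as the whole population of interest.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

five_number_summary = (min(data), q1, median, q3, max(data))
iqr = q3 - q1

# Assumed convention: flag values more than 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(five_number_summary, iqr, outliers)
```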

200

Definition: Choosing individuals who are easy to reach. Often leads to bias.
Example: Surveying only friends about school lunch.

What is Convenience Sampling 

200

Definition: A rule used to find the probability that either of two events occurs.
Formula: P(A or B) = P(A) + P(B) - P(A and B)

What is Addition Rule 
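
As an illustration of the addition rule, the sketch below checks the formula against a direct count on a standard 52-card deck. The choice of events (A = the card is a heart, B = the card is a face card) is an assumption made for this example.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 equally likely cards


def prob(event):
    """Probability of an event as (favorable cards) / (all cards)."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


def is_heart(card):
    return card[1] == "hearts"


def is_face(card):
    return card[0] in {"J", "Q", "K"}


p_a = prob(is_heart)
p_b = prob(is_face)
p_a_and_b = prob(lambda c: is_heart(c) and is_face(c))

# Addition rule: subtract the overlap so it is not counted twice.
p_a_or_b_rule = p_a + p_b - p_a_and_b
# Direct count of cards that are a heart, a face card, or both.
p_a_or_b_count = prob(lambda c: is_heart(c) or is_face(c))

print(p_a_or_b_rule, p_a_or_b_count)
```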

300

Definition: A sampling method where the population is divided into groups, and entire clusters are randomly selected.
Example: Randomly choosing 3 school classes and surveying everyone in them.

What is Cluster Sample 
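
The class example can be sketched in a few lines. The class names and rosters below are hypothetical; the point is that whole clusters are chosen at random and then everyone inside the chosen clusters is surveyed.

```python
import random

# Hypothetical school: class name -> students in that class.
classes = {
    "Class A": ["Ana", "Ben", "Cy"],
    "Class B": ["Dee", "Eli"],
    "Class C": ["Fay", "Gus", "Hal"],
    "Class D": ["Ivy", "Jon"],
    "Class E": ["Kai", "Lou", "Mia"],
}

rng = random.Random(0)  # fixed seed so the run is repeatable

# Step 1: randomly select entire clusters (here, 3 classes).
chosen_classes = rng.sample(sorted(classes), k=3)

# Step 2: survey every student in each selected class.
surveyed = [student for name in chosen_classes for student in classes[name]]

print(chosen_classes, surveyed)
```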

300

Definition: Failing to reject a false null hypothesis (false negative).
Example: Saying a drug doesn't work when it actually does.

What is Type II Error 

300

Definition: A measure of how spread out the data is from the mean.

What is Standard Deviation 
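
A short sketch of the calculation, using a made-up data set. Note the assumption about which version is wanted: the population standard deviation divides by n, while the sample standard deviation divides by n - 1.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values

mean = statistics.mean(data)

# By hand: square root of the average squared distance from the mean.
manual_population_sd = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))

population_sd = statistics.pstdev(data)  # divides by n
sample_sd = statistics.stdev(data)       # divides by n - 1

print(mean, manual_population_sd, population_sd, sample_sd)
```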

300

Definition: A variable that affects both the explanatory and response variables, making it hard to determine causation.

What is Confounding Variable

300

Definition: As the sample size increases, the sample mean tends to get closer to the population mean.
Example: The more times you flip a fair coin, the closer the proportion of heads gets to 0.5.

What is Law of Large Numbers 
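
The coin example can be simulated directly. This is a sketch under assumed settings (a fixed random seed and arbitrary checkpoints); exact proportions depend on the seed, but the proportion of heads should drift toward 0.5 as the number of flips grows.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

heads = 0
flips = 0
proportions = {}  # number of flips -> proportion of heads so far

for checkpoint in (10, 100, 1_000, 10_000, 100_000):
    while flips < checkpoint:
        heads += rng.random() < 0.5  # True counts as 1 head
        flips += 1
    proportions[checkpoint] = heads / flips

for n, p in proportions.items():
    print(f"{n:>7} flips: proportion of heads = {p:.4f}")
```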

400

Definition: Divide population into groups, then randomly sample from each.
Example: Sampling students from each grade level.

What is Stratified Random Sample 
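
The grade-level example might look like the sketch below. The rosters and the choice of 2 students per grade are hypothetical. In contrast to a cluster sample, every group contributes some members to the sample.

```python
import random

# Hypothetical rosters: grade level (stratum) -> students.
students_by_grade = {
    9: ["Ana", "Ben", "Cy", "Dee"],
    10: ["Eli", "Fay", "Gus", "Hal"],
    11: ["Ivy", "Jon", "Kai", "Lou"],
    12: ["Mia", "Ned", "Oz", "Pam"],
}

rng = random.Random(0)  # fixed seed so the run is repeatable

# Draw a simple random sample of 2 students from every grade.
stratified_sample = {
    grade: rng.sample(names, k=2) for grade, names in students_by_grade.items()
}

print(stratified_sample)
```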

400

Definition: The probability of getting results at least as extreme as those observed, assuming the null hypothesis is true.

What is P-Value 
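
As a worked illustration, the sketch below computes an exact p-value for a coin-flip experiment. The scenario is an assumption made for this example: the null hypothesis says the coin is fair, the alternative says it favors heads (a one-sided test), and 60 heads are observed in 100 flips. The 0.05 cutoff is one of the common significance levels.

```python
from math import comb

n = 100             # number of flips (assumed)
observed_heads = 60  # observed result (assumed)


def binom_pmf(k, n, p=0.5):
    """Probability of exactly k heads in n flips when P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# "At least as extreme" here means 60 or more heads, assuming a fair coin.
p_value = sum(binom_pmf(k, n) for k in range(observed_heads, n + 1))

alpha = 0.05
reject_null = p_value < alpha

print(f"p-value = {p_value:.4f}, reject null at alpha={alpha}: {reject_null}")
```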

400

Definition: The value below which a given percentage of the data falls.
Example: A score at the 90th percentile is higher than 90% of the scores.

What is Percentile 

400

Definition: When some groups are left out or underrepresented in a sample.

What is Undercoverage 

400

Definition: The probability that an event does not happen.
Formula: P(not A) = 1 - P(A)

What is Complement
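
The complement rule is handy when "at least one" is easier to reach through "none". A small sketch, assuming four independent rolls of a fair six-sided die (a scenario chosen for this example):

```python
from fractions import Fraction

# A = "at least one six in four rolls"; not A = "no six in four rolls".
p_no_six_one_roll = Fraction(5, 6)
p_no_six_four_rolls = p_no_six_one_roll ** 4  # rolls assumed independent

# Complement rule: P(A) = 1 - P(not A)
p_at_least_one_six = 1 - p_no_six_four_rolls

print(p_at_least_one_six, float(p_at_least_one_six))
```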

500

Definition: A variable not included in the study that affects both explanatory and response variables.
Example: Ice cream sales and drowning deaths may both be influenced by temperature.

What is Lurking Variable

500

Definition: The probability of correctly rejecting a false null hypothesis (avoiding a Type II error).
Formula: Power = 1 - P(Type II error)

What is Power

500

Definition: A measure strongly affected by outliers or skewness.
Examples: Mean, standard deviation, range

What is Nonresistant Measure 
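
A quick sketch of what "strongly affected by outliers" means, using two made-up data sets that differ in a single value. The mean (nonresistant) moves a lot, while the median (resistant) does not move at all here.

```python
import statistics

typical = [10, 12, 13, 14, 15]        # made-up values
with_outlier = [10, 12, 13, 14, 100]  # same data, largest value replaced

mean_shift = statistics.mean(with_outlier) - statistics.mean(typical)
median_shift = statistics.median(with_outlier) - statistics.median(typical)

print(f"mean shifted by {mean_shift}, median shifted by {median_shift}")
```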

500

Definition: Keeping subjects or researchers unaware of the treatment group to avoid bias.

What is Blinding

500

Definition: The probability of one event given that another event has occurred.
Formula: P(A|B) = P(A and B) / P(B), where P(B) > 0

What is Conditional Probability 
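
To see the formula at work, the sketch below enumerates all outcomes of rolling two fair dice. The events are assumptions chosen for this example: A = the sum is 8, B = the first die is even.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely rolls


def prob(event):
    """Probability of an event as (favorable outcomes) / (all outcomes)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def sum_is_8(o):
    return o[0] + o[1] == 8


def first_is_even(o):
    return o[0] % 2 == 0


p_a = prob(sum_is_8)
p_b = prob(first_is_even)
p_a_and_b = prob(lambda o: sum_is_8(o) and first_is_even(o))

# Conditional probability: P(A|B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b

print(p_a, p_a_given_b)
```

Knowing that B occurred changes the probability of A in this example, which means the two events are not independent.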
