Mean, Median, Standard Deviation
Sampling Theory
Hypothesis Testing
Regression Analysis
Miscellaneous
100

The arithmetic average for a list of numbers

Mean

100

A subset of observations drawn from a bigger set of observations.

Sample

100

The hypothesis that says: "Nah, there's no relationship there."

Null hypothesis

100

This measure tells you whether the relationship between X and Y is positive or negative, and how strong that relationship is.

Correlation (r)

100

Type of research where you begin by making observations and then move on to develop a theory

Induction

200

The __________ is the center value of a sorted list

Median

200

The _________ says: "About 68% of the data fall within 1 standard deviation of the mean, ~95% within 2 standard deviations, and ~99.7% within 3 standard deviations."

Empirical Rule
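
A quick way to check those percentages, as a minimal sketch: it assumes the data follow a normal distribution and uses Python's standard-library `statistics.NormalDist`. The helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

def share_within(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    # Prints roughly 0.68, 0.95, and 0.997 for k = 1, 2, 3.
    print(f"within {k} SD: {share_within(k):.4f}")
```

The rule only holds this neatly for bell-shaped data; skewed data can land far from these shares.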

200

Your test statistic needs to be larger than the ________ for you to reject the null hypothesis.

critical value

200

Linear regression is appropriate for comparing 2 or more variables that are ___________-level measurements.

interval or ratio

200

Another word for a bell curve

Normal distribution

300

_______ is a measure of how much the data scatters around the mean.

Standard deviation

300

The __________ says: "As the sample size increases, the observed values converge on the expected values."

Law of Large Numbers
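
The convergence described in the clue can be seen in a small simulation. This is only a sketch: the fair six-sided die (expected value 3.5), the seed, and the helper name `average_roll` are all assumptions chosen for illustration.

```python
import random

random.seed(42)  # arbitrary seed so the run is repeatable

def average_roll(n_rolls: int) -> float:
    """Average of n_rolls simulated rolls of a fair six-sided die."""
    return sum(random.randint(1, 6) for _ in range(n_rolls)) / n_rolls

# The observed average drifts toward the expected value of 3.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, round(average_roll(n), 3))
```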

300

_______ is the test statistic you would use when comparing 2 means

t-score

300

Lance did a regression to see if caffeine consumption increases the amount of weight that people can lift. After he did the analysis, he said: 

"I found that with every additional cup of coffee, people lift 10 pounds more weight."

What measure is he reporting on?

Slope coefficient

300

What's the Fundamental Problem of Causal Inference?

We cannot observe the counterfactual.

400

Which measure of center (mean or median) is affected the MOST by an outlier?

Mean
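
A small made-up example shows why. The numbers below are invented for illustration; only the standard-library `statistics` module is used.

```python
from statistics import mean, median

values = [30, 35, 40, 45, 50]      # made-up data
with_outlier = values + [1000]     # same data plus one extreme value

# Without the outlier, mean and median agree (both 40).
print(mean(values), median(values))

# With the outlier, the mean jumps to 200 while the median only moves to 42.5.
print(mean(with_outlier), median(with_outlier))
```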

400

A ____________ is the distribution of a sample statistic computed over an infinite number of same-size samples.

Sampling distribution

400

The difference in outcomes between the treatment group and the control group.

Treatment Effect

400

What is the difference between the predicted value and the observed value called?

Error (or residual)
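
To make the idea concrete, here is a minimal sketch that fits a one-predictor least-squares line by hand and then subtracts each predicted value from the observed one. The coffee and weight numbers are made up for illustration and are not from any real study.

```python
from statistics import mean

cups = [0, 1, 2, 3, 4]                # made-up data: cups of coffee
lifted = [100, 112, 118, 131, 139]    # made-up data: pounds lifted

x_bar, y_bar = mean(cups), mean(lifted)

# Least-squares slope and intercept for a one-predictor regression.
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(cups, lifted)) / sum(
    (x - x_bar) ** 2 for x in cups
)
intercept = y_bar - slope * x_bar

predicted = [intercept + slope * x for x in cups]

# Residual = observed value minus predicted value.
residuals = [y - y_hat for y, y_hat in zip(lifted, predicted)]

print(round(slope, 2), [round(r, 2) for r in residuals])
```

With an intercept in the model, least-squares residuals sum to zero, which is a handy sanity check.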

400

True or False:

Theory is a simple statement about the relationship between cause and effect. Hypothesis gives the reasons why one causes the other.


False

500

Why is standard deviation better than variance when talking about dispersion around the mean?

Because variance squares each observation's distance from the mean, it is expressed in squared units and is hard to interpret in the context of the data. Standard deviation is measured in the same units as the actual data.
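
The units point is easy to see with numbers. A minimal sketch with made-up weights in pounds, using the population versions of each measure from Python's standard library:

```python
from statistics import pstdev, pvariance

weights_lb = [150, 160, 170, 180, 190]  # made-up weights, in pounds

variance = pvariance(weights_lb)  # 200, in squared pounds
std_dev = pstdev(weights_lb)      # about 14.14, in pounds

print(variance, round(std_dev, 2))
```

"About 14 pounds from the mean" reads naturally next to the data; "200 squared pounds" does not.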

500

The ________ says: "As the sample size grows, the sampling distribution of the sample mean approaches a normal distribution centered on the population parameter it estimates."

Central Limit Theorem
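
The theorem can be checked by simulation. This is a sketch under stated assumptions: the population is uniform on 0 to 1 (flat, not bell-shaped), and the sample size, number of samples, and seed are arbitrary choices for illustration.

```python
import random
from statistics import fmean, pstdev

random.seed(0)  # arbitrary seed so the run is repeatable

SAMPLE_SIZE = 30    # observations per sample (assumed)
N_SAMPLES = 5_000   # number of samples drawn (assumed)

# Draw many samples from a uniform(0, 1) population and keep each sample's mean.
sample_means = [
    fmean(random.random() for _ in range(SAMPLE_SIZE))
    for _ in range(N_SAMPLES)
]

center = fmean(sample_means)   # should sit near the population mean, 0.5
spread = pstdev(sample_means)  # should sit near (1/12) ** 0.5 / SAMPLE_SIZE ** 0.5

# Share of sample means within 1 standard deviation of the center;
# a bell curve puts roughly 68% there.
within_one_sd = sum(abs(m - center) <= spread for m in sample_means) / N_SAMPLES

print(round(center, 3), round(spread, 3), round(within_one_sd, 3))
```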

500

After doing some math, Dave exclaimed: "If we take an infinite number of samples of the same size and build an interval like this one from each, 95% of those intervals will contain the population mean. For our sample, the interval runs from 150 to 153!"

What measure is Dave reporting?

Confidence Interval

500

What do we need to see to conclude that an independent variable is "statistically significant"?

A p-value of less than 0.05 (the conventional cutoff)

500

Why are randomized experiments better at causal inference than observational research?

Because the researcher manipulates the independent variable and assigns it at random, which guards against threats to causal inference (like spurious correlation).
