The arithmetic average of a list of numbers
Mean
A subset of observations drawn from a bigger set of observations.
Sample
The hypothesis that says: "Nah, there's no relationship there."
Null hypothesis
This measure tells you whether the relationship between X and Y is positive or negative, and how strong that relationship is.
Correlation (r)
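To see how the sign and size of r behave, here is a minimal sketch that computes Pearson's r by hand. The function name `pearson_r` and the data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of X and Y around their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Spread of each variable around its own mean
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data: one perfectly rising and one perfectly falling relationship
hours = [1, 2, 3, 4, 5]
score_up = [2, 4, 6, 8, 10]
score_down = [10, 8, 6, 4, 2]

r_pos = pearson_r(hours, score_up)    # close to +1: strong positive
r_neg = pearson_r(hours, score_down)  # close to -1: strong negative
print(r_pos, r_neg)
```

The sign gives the direction of the relationship, and the distance from 0 gives its strength.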
Type of research where you begin by making observations and then move on to come up with a theory
Induction
The __________ is the middle value of a sorted list
median
The _________ says: "In a normal distribution, about 68% of the data fall within 1 standard deviation of the mean, about 95% within 2 standard deviations, and about 99.7% within 3 standard deviations."
Empirical Rule
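The three percentages can be checked against the standard normal curve. This is a small sketch using Python's `statistics.NormalDist`; the helper name `share_within` is just for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

coverage = {k: share_within(k) for k in (1, 2, 3)}
for k, share in coverage.items():
    print(f"within {k} SD: {share:.1%}")
```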
Your test statistic needs to be larger than the ________ for you to reject the null hypothesis.
critical value
Linear regression is appropriate for comparing 2 or more variables that are ___________-level measurements.
interval or ratio
Another word for a bell curve
Normal distribution
_______ is a measure of how much the data scatters around the mean.
Standard deviation
The __________ says: "As the sample size increases, the observed values converge on the expected values."
Law of Large Numbers
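A quick simulation makes the idea concrete. This sketch assumes a fair coin, so the expected share of heads is 0.5. The seed and sample sizes are arbitrary choices.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

def share_of_heads(n):
    """Flip a simulated fair coin n times and return the observed share of heads."""
    return sum(rng.random() < 0.5 for _ in range(n)) / n

results = {n: share_of_heads(n) for n in (10, 1_000, 100_000)}
for n, share in results.items():
    print(f"n = {n:>7}: observed share of heads = {share:.4f}")
```

With small n the observed share can land far from 0.5. With large n it should sit very close to it.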
_______ is the test statistic you would use when comparing 2 means
t-score
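As a rough illustration, here is one common way to compute a t-score for two group means. It assumes the unequal-variance (Welch) form of the standard error, which may differ from the version used in a given course. The name `welch_t` and the data are made up.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """t-score for the difference between two sample means (Welch form)."""
    # Standard error of the difference, built from each group's sample variance
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# Hypothetical scores for two groups
treatment = [5, 6, 7, 8, 9]
control = [1, 2, 3, 4, 5]

t_score = welch_t(treatment, control)
print(t_score)
```

The t-score is then compared with a critical value to decide whether to reject the null hypothesis.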
Lance did a regression to see if caffeine consumption increases the amount of weight that people can lift. After he did the analysis, he said:
"I found that with every additional cup of coffee, people lift 10 pounds more weight."
What measure is he reporting on?
Slope coefficient
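A sketch of where a number like Lance's could come from: fit a least-squares line and read off the slope. The data below are invented to produce a slope of 10, and `fit_line` is a hypothetical helper, not Lance's actual analysis.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: cups of coffee and pounds lifted
cups = [0, 1, 2, 3, 4]
pounds = [100, 110, 120, 130, 140]

slope, intercept = fit_line(cups, pounds)
print(f"each extra cup is associated with {slope:.1f} more pounds lifted")
```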
What's the Fundamental Problem of Causal Inference?
Which measure of center (mean or median) is affected the MOST by an outlier?
Mean
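A small check with made-up numbers shows the difference. One extreme value is added to a list and both measures are recomputed.

```python
from statistics import mean, median

values = [10, 12, 13, 15, 16]         # made-up data, no outlier
values_with_outlier = values + [200]  # same data plus one extreme value

mean_shift = mean(values_with_outlier) - mean(values)
median_shift = median(values_with_outlier) - median(values)
print(f"mean moved by {mean_shift:.2f}, median moved by {median_shift:.2f}")
```

The mean is dragged toward the outlier, while the median barely moves.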
A ____________ is a distribution made up of an infinite number of sample statistics.
Sampling distribution
The difference between the treatment group and the control group.
Treatment Effect
What is the difference between the predicted value and the observed value called?
Error (or residual)
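In code, a residual is just observed minus predicted. This sketch assumes a hypothetical fitted line (100 pounds plus 10 per cup of coffee) and invented observations.

```python
def predict(cups):
    """Hypothetical fitted line: 100 pounds baseline plus 10 pounds per cup."""
    return 100 + 10 * cups

cups = [0, 1, 2]
observed = [98, 113, 119]  # made-up observed pounds lifted

predicted = [predict(c) for c in cups]
residuals = [obs - pred for obs, pred in zip(observed, predicted)]
print(residuals)
```

A positive residual means the observation sits above the line, and a negative one means it sits below.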
True or False:
Theory is a simple statement about the relationship between cause and effect. Hypothesis gives the reasons why one causes the other.
False
Why is standard deviation better than variance when talking about dispersion around the mean?
Because the variance squares the distance of each observation from the mean, it is expressed in squared units and is hard to interpret in the context of the data. Standard deviation is measured in the same units as the actual data.
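The units issue is easy to see with numbers. This sketch uses made-up body weights in pounds and the population versions of both measures.

```python
from statistics import pstdev, pvariance

weights_lb = [150, 160, 170, 180, 190]  # made-up weights in pounds

var_lb2 = pvariance(weights_lb)  # in squared pounds
sd_lb = pstdev(weights_lb)       # in pounds, same units as the data

print(f"variance: {var_lb2:.1f} squared pounds")
print(f"standard deviation: {sd_lb:.1f} pounds")
```

A spread of about 14 pounds is easy to picture next to weights of 150 to 190. A value of 200 squared pounds is not.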
The ________ says: "With an infinite number of sample statistics, the sampling distribution approaches normality around the population parameter it estimates."
Central Limit Theorem
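A simulation can hint at this with a finite number of samples. The sketch below assumes a skewed population (exponential with mean 1). The seed, sample size, and number of samples are arbitrary choices.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable
SAMPLE_SIZE = 50
NUM_SAMPLES = 2000

def one_sample_mean():
    """Draw one sample from a skewed population and return its mean."""
    return mean([rng.expovariate(1.0) for _ in range(SAMPLE_SIZE)])

# Each entry is one sample statistic; together they approximate a sampling distribution
sample_means = [one_sample_mean() for _ in range(NUM_SAMPLES)]

center = mean(sample_means)   # should sit near the population mean of 1
spread = stdev(sample_means)  # should sit near 1 / sqrt(50)
print(f"center = {center:.3f}, spread = {spread:.3f}")
```

Even though the population is skewed, a histogram of `sample_means` should look roughly bell-shaped and centered on the population mean.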
What measure is Dave reporting?
Confidence Interval
What do we need to see to conclude that an independent variable is "statistically significant"?
p-value of less than 0.05
Why are randomized experiments better at causal inference than observational research?
Because the researcher can manipulate the independent variable, and therefore avoid threats to causal inference (like spurious correlation).