What does the correlation coefficient r measure between two quantitative variables?
The strength and direction of a linear relationship.
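A minimal sketch of how r could be computed by hand. The function name pearson_r and the hours/scores data are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the sample correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours studied vs. exam score
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
r = pearson_r(hours, scores)  # close to +1: strong positive linear association
```

The sign of r gives the direction of the relationship; how close it is to -1 or +1 gives the strength.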
In simple linear regression, which variable is predicted?
The response variable (y).
What is extrapolation?
Predicting outside the observed x-range.
Formula for relative frequency?
(number of times the event occurs) / (total number of trials).
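As a quick illustration, relative frequency can be computed from a list of recorded outcomes. The helper name relative_frequency and the die rolls are hypothetical.

```python
def relative_frequency(outcomes, event):
    """Fraction of trials in which `event` (a true/false test) holds."""
    hits = sum(1 for o in outcomes if event(o))
    return hits / len(outcomes)

# Hypothetical record of 10 die rolls
rolls = [3, 6, 1, 6, 2, 5, 6, 4, 2, 6]
rf_six = relative_frequency(rolls, lambda o: o == 6)  # 4 sixes in 10 rolls
```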
Interpret P(A | B) in words.
“The probability of A given B.”
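One way to see this with numbers is the standard formula P(A | B) = P(A and B) / P(B), applied to a two-way table. The survey counts below are invented for this sketch.

```python
# Hypothetical counts from a survey of 100 students: (result, habit) -> count
counts = {
    ("passed", "studied"): 45,
    ("passed", "did_not_study"): 15,
    ("failed", "studied"): 5,
    ("failed", "did_not_study"): 35,
}
total = sum(counts.values())

# B = "studied", A = "passed"
p_b = (counts[("passed", "studied")] + counts[("failed", "studied")]) / total
p_a_and_b = counts[("passed", "studied")] / total

# Restrict attention to the students who studied, then ask how many passed
p_a_given_b = p_a_and_b / p_b
```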
If r = 0.95, describe the association.
Strong positive linear association.
Define a residual.
Residual = observed y − predicted ŷ.
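A small sketch of the calculation, assuming a line ŷ = 50 + 5x has already been fitted. The line and the data points are hypothetical.

```python
def residuals(xs, ys, intercept, slope):
    """Observed y minus predicted y-hat for each point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical data and fitted line y-hat = 50 + 5x
xs = [1, 2, 3, 4]
ys = [56, 59, 66, 70]
res = residuals(xs, ys, intercept=50, slope=5)
# Positive residual: point lies above the line; negative: below it
```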
What is a regression outlier?
A point far from the trend of the rest of the data.
Valid range of a probability?
From 0 to 1, inclusive.
In symbols, what does the vertical slash “|” mean?
“Given.”
If r is near 0, what does that indicate?
Weak or no linear relationship (may still be non-linear).
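The sketch below shows the non-linear caveat: y is completely determined by x, yet r comes out as 0 because the relationship is a symmetric curve rather than a line. The data are made up.

```python
from math import sqrt

# Hypothetical data: y depends on x exactly, but through a curve (y = x^2)
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

# The positive and negative cross-products cancel, so r is 0
r = sxy / sqrt(sxx * syy)
```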
What does a large residual suggest about an observation?
It’s unusual or far from the fitted line.
Define Simpson’s Paradox.
The direction of an association reverses when a third variable is considered.
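A toy illustration with invented counts: treatment A has the higher success rate within each difficulty group, but B looks better once the groups are pooled. The group and treatment labels are hypothetical.

```python
# Hypothetical (successes, trials) for treatments A and B in two groups
data = {
    "easy": {"A": (9, 10), "B": (80, 100)},
    "hard": {"A": (30, 100), "B": (2, 10)},
}

# Success rate within each group
within = {
    group: {t: s / n for t, (s, n) in by_treatment.items()}
    for group, by_treatment in data.items()
}

# Success rate after pooling the groups (ignoring the third variable)
overall = {}
for t in ("A", "B"):
    successes = sum(data[g][t][0] for g in data)
    trials = sum(data[g][t][1] for g in data)
    overall[t] = successes / trials
```

The reversal happens because A was mostly used on hard cases and B mostly on easy ones, so the group variable is tangled up with the treatment.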
When are two events independent?
When knowing that one occurred does not change the probability of the other, i.e. P(A | B) = P(A).
When are events disjoint (mutually exclusive)?
When they have no outcomes in common.
True or False: A high correlation implies causation.
False. Correlation does not imply causation.
Which line minimizes the sum of squared residuals?
The least squares regression line.
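A minimal sketch of the closed-form fit, with a check that nudging the slope or intercept only increases the sum of squared residuals. The function names and data are illustrative, not from any particular library.

```python
def fit_least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sse(xs, ys, intercept, slope):
    """Sum of squared residuals for a given line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [52, 60, 61, 70, 77]
a, b = fit_least_squares(xs, ys)
best = sse(xs, ys, a, b)

# Any other line does worse, e.g. a slightly steeper or higher one
steeper = sse(xs, ys, a, b + 0.1)
higher = sse(xs, ys, a + 0.5, b)
```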
What is a lurking variable?
A variable not included in the study that influences the relationship between the explanatory and response variables.
General addition rule: P(A or B) = ?
P(A) + P(B) − P(A and B).
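The rule can be checked by counting outcomes for one roll of a fair die. The events A and B below are chosen only for illustration.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # one roll of a fair die
A = {2, 4, 6}                    # roll is even
B = {4, 5, 6}                    # roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

lhs = prob(A | B)                      # P(A or B), counted directly
rhs = prob(A) + prob(B) - prob(A & B)  # general addition rule
```

Subtracting P(A and B) removes the outcomes 4 and 6, which would otherwise be counted twice.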
If P(A) = 0.4 and P(B) = 0.5 and A, B are independent, find P(A and B).
0.20, since P(A and B) = P(A) × P(B) = 0.4 × 0.5.
Name one limitation of correlation.
It only captures linear relationships and is sensitive to outliers.
The slope (b) of the least-squares line is directly related to which statistic, and what else influences the intercept?
Correlation: b = r × (s_y / s_x), where s_x and s_y are the standard deviations of x and y. The intercept depends on the slope and the means: a = ȳ − b × x̄.
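A short numerical check of the standard formulas b = r × (s_y / s_x) and a = ȳ − b × x̄, using Python's statistics module. The data and variable names are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [52, 60, 61, 70, 77]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations

# Correlation from standardized cross-products
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)

b = r * s_y / s_x      # slope
a = y_bar - b * x_bar  # intercept: the line passes through (x_bar, y_bar)
```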
Difference between lurking and confounding variables?
Lurking: not measured or included in the study. Confounding: measured, but its effect on the response cannot be separated from that of another explanatory variable.
Multiplication rule for independent A and B?
P(A and B) = P(A) × P(B).
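A sketch that verifies the rule by enumeration, assuming a fair coin flip and a fair die roll that do not influence each other. The event definitions are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 12 equally likely (coin, die) outcomes
sample_space = list(product("HT", range(1, 7)))

def prob(event):
    """Probability of an event given as a true/false test on an outcome."""
    hits = sum(1 for o in sample_space if event(o))
    return Fraction(hits, len(sample_space))

def heads(o):
    return o[0] == "H"

def six(o):
    return o[1] == 6

p_both = prob(lambda o: heads(o) and six(o))  # counted directly
p_product = prob(heads) * prob(six)           # multiplication rule
```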
Define a probability model.
A list of all possible outcomes (the sample space) together with the probability assigned to each outcome or event.