Which measure of center is most resistant to outliers?
Median
What is the total area under any density curve?
1
What four features should always be described when analyzing a scatterplot?
Direction, form, strength, and outliers
What does the slope of a least-squares regression line represent?
The predicted change in y for a one-unit increase in x
What is the probability of an impossible event?
0
What is conditional probability?
The probability of an event given that another event has occurred
What is the difference between a discrete and continuous random variable?
Discrete takes countable values with gaps between them; continuous takes all values in an interval
Name 2 types of displays used for a quantitative variable.
Histogram, dotplot, stemplot
If a distribution is symmetric and bell-shaped, how to the mean & median compare?
They are approximately equal
What does a correlation of r = -0.85 indicate?
A strong negative linear relationship
What is a residual?
The difference between the observed value and the predicted value
(y-hat y)
Two events are mutually exclusive. What is P(A ∩ B)?
0
Write the formula for conditional probability P(A | B)
P(A | B) = P(A ∩ B) / P(B)
What must be true about the probabilities in a probability distribution? (2 requirements)
They must all be between 0 and 1, and they must add up to 1
What does the IQR measure?
The spread of the middle 50% of the data
In a Normal Distribution, what percent of data falls within two standard deviations of the mean?
About 95%
True or false: Correlation is resistant to outliers
False
What does r2 represent?
The proportion of variability in y explained by the variability in x (or explained by the linear relationship with x)
State the general addition rule for two events A and B
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
If P(A | B) = P(A), what does that indicate?
Events A and B are independent
What are the 4 characteristics that define a binomial random variable?
Binary outcomes, independent trials, number of trials is fixed, same probability of success in each trial
If a distribution is strongly right-skewed, what is the relationship between the mean and the median?
The mean is greater than the median
What does a z-score represent?
The number of standard deviations a value is from the mean
Why is it incorrect to say that correlation implies causation?
Because a strong association may be due to lurking (confounding) variables or coincidence
Why should you not use a regression model to extrapolate far beyond the data?
Because the relationship may not continue outside the observed range
What does it mean for two events to be mutually exclusive?
They cannot happen at the same time
Two events are independent if P(A ∩ B) = ...
P(A) * P(B)
How does multiplying a random variable by a constant affect its standard deviation?
The standard deviation is multiplied by (the absolute value of) the constant
Describe the four characteristics you should include when describing a quantitative distribution
Shape, center, spread/variability, outliers
Describe how to determine the proportion of observations below a given value in a Normal distribution.
Convert the value to a z-score and use a Normal table or calculator
Describe a situation where two variables might have a strong correlation but to causal relationship
Example: Ice cream sales and shark attacks (both affected by warm temperatures)
What does s represent for a linear regression between two variables?
The standard deviation of the residuals. How much the predicted values differ from the actual values on average.
What does it mean for two events to be independent?
The outcome of one event does not affect the probability of the other event
True or false: Two events can be both independent and mutually exclusive
False
What two conditions must be met to model a binomial random variable using a Normal distribution?
- Large counts condition (np and n(1-p) ≥ 10)
- Trials are independent OR the sample size n < 10% of the population size N