One-Variable Data
Density Curves/Normal Distributions
Scatterplots & Correlation
Linear Regression
Probability Basics
Conditional Probability & Independence
Random Variables
100

Which measure of center is most resistant to outliers?

Median
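
A quick way to see this resistance is to append one extreme value to a small data set and compare how far each measure moves. The sketch below uses made-up scores and a hypothetical helper named center_shift; it is an illustration only.

```python
from statistics import mean, median

def center_shift(values, outlier):
    """Return how far the mean and the median move when one outlier is appended."""
    with_outlier = list(values) + [outlier]
    mean_shift = mean(with_outlier) - mean(values)
    median_shift = median(with_outlier) - median(values)
    return mean_shift, median_shift

scores = [70, 72, 75, 78, 80]   # made-up data for illustration
mean_shift, median_shift = center_shift(scores, 300)
print(mean_shift, median_shift)  # the mean moves far more than the median
```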

100

What is the total area under any density curve?

1

100

What four features should always be described when analyzing a scatterplot?

Direction, form, strength, and outliers

100

What does the slope of a least-squares regression line represent?

The predicted change in y for a one-unit increase in x

100

What is the probability of an impossible event?

0

100

What is conditional probability?

The probability of an event given that another event has occurred

100

What is the difference between a discrete and continuous random variable?

Discrete takes countable values with gaps between them; continuous takes all values in an interval

200

Name 2 types of displays used for a quantitative variable.

Any two of: histogram, dotplot, stemplot, boxplot

200

If a distribution is symmetric and bell-shaped, how do the mean & median compare?

They are approximately equal

200

What does a correlation of r = -0.85 indicate?

A strong negative linear relationship

200

What is a residual?

The difference between the observed value and the predicted value

(y − ŷ)
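
As a small worked example, the residual at a single point is the observed y minus the value the line predicts. The regression line used here (y-hat = 2 + 3x) and the observed values are hypothetical, chosen only to keep the arithmetic easy.

```python
def predict(x):
    """Hypothetical regression line: y_hat = 2 + 3x."""
    return 2 + 3 * x

def residual(observed_y, predicted_y):
    """Residual = observed value minus predicted value."""
    return observed_y - predicted_y

# Observed y = 16 at x = 4; the line predicts 14, so the point sits above the line.
res = residual(16, predict(4))
print(res)
```

A positive residual means the line underpredicted the observation; a negative residual means it overpredicted.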

200

Two events are mutually exclusive. What is P(A ∩ B)?

0

200

Write the formula for conditional probability P(A | B)

P(A | B) = P(A ∩ B) / P(B)
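
The formula can be checked with counts from a two-way table. This is a minimal sketch with made-up counts (100 students, 40 in B, 10 in both A and B); the function name conditional_probability is just an illustrative choice.

```python
from fractions import Fraction

def conditional_probability(p_a_and_b, p_b):
    """Return P(A | B) = P(A and B) / P(B); P(B) must be positive."""
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b

total = 100     # made-up counts for illustration
in_b = 40
in_both = 10

p_a_given_b = conditional_probability(Fraction(in_both, total), Fraction(in_b, total))
print(p_a_given_b)
```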

200

What must be true about the probabilities in a probability distribution? (2 requirements)

They must all be between 0 and 1, and they must add up to 1
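
Both requirements are easy to check mechanically. The helper below, with the hypothetical name is_valid_distribution, is one possible sketch; it uses a small tolerance because decimal probabilities rarely sum to exactly 1 in floating point.

```python
from math import isclose

def is_valid_distribution(probabilities):
    """Check that every probability is in [0, 1] and that they sum to 1."""
    all_in_range = all(0 <= p <= 1 for p in probabilities)
    sums_to_one = isclose(sum(probabilities), 1.0, abs_tol=1e-9)
    return all_in_range and sums_to_one

print(is_valid_distribution([0.2, 0.5, 0.3]))    # meets both requirements
print(is_valid_distribution([0.5, 0.6, -0.1]))   # sums to 1 but has a negative value
print(is_valid_distribution([0.5, 0.4]))         # in range but sums to 0.9
```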

300

What does the IQR measure?

The spread of the middle 50% of the data

300

In a Normal distribution, what percent of data falls within two standard deviations of the mean?

About 95%
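
The "about 95%" figure can be confirmed from the standard Normal curve. This sketch uses NormalDist from Python's standard library; the helper name proportion_within is an illustrative choice.

```python
from statistics import NormalDist

def proportion_within(k):
    """Proportion of a Normal distribution within k standard deviations of the mean."""
    std_normal = NormalDist()   # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)

within_two = proportion_within(2)
print(round(within_two, 4))
```

Calling proportion_within(1) and proportion_within(3) gives the other two parts of the 68-95-99.7 rule.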

300

True or false: Correlation is resistant to outliers

False

300

What does r² represent?

The proportion of the variability in y that is explained by the linear relationship with x
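
For a concrete number, r² can be computed by squaring the correlation. The sketch below does this from the sums of squares on a small made-up data set; the function name r_squared is hypothetical.

```python
def r_squared(xs, ys):
    """Square of the correlation between xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

xs = [1, 2, 3, 4, 5]   # made-up data for illustration
ys = [2, 4, 5, 4, 5]
r2 = r_squared(xs, ys)
print(round(r2, 3))
```

With these values, about 60% of the variability in y is accounted for by the linear relationship with x.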

300

State the general addition rule for two events A and B

P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
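
One way to convince yourself of the rule is to compare it against a direct count of outcomes. The example below assumes one roll of a fair six-sided die, with A = "even" and B = "greater than 4"; both events are chosen only for illustration.

```python
from fractions import Fraction

outcomes = set(range(1, 7))                  # one roll of a fair die
a = {x for x in outcomes if x % 2 == 0}      # A: even number
b = {x for x in outcomes if x > 4}           # B: greater than 4

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

by_rule = prob(a) + prob(b) - prob(a & b)    # general addition rule
by_count = prob(a | b)                       # count the union directly
print(by_rule, by_count)
```

Subtracting P(A ∩ B) removes the outcome 6, which would otherwise be counted twice.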

300

If P(A | B) = P(A), what does that indicate?

Events A and B are independent

300

What are the 4 characteristics that define a binomial random variable?

Binary outcomes, independent trials, number of trials is fixed, same probability of success in each trial

400

If a distribution is strongly right-skewed, what is the relationship between the mean and the median?

The mean is greater than the median

400

What does a z-score represent?

The number of standard deviations a value is from the mean

400

Why is it incorrect to say that correlation implies causation?

Because a strong association may be due to lurking (confounding) variables or coincidence

400

Why should you not use a regression model to extrapolate far beyond the data?

Because the relationship may not continue outside the observed range

400

What does it mean for two events to be mutually exclusive?

They cannot happen at the same time

400

Two events are independent if P(A ∩ B) = ...

P(A) * P(B)
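
The multiplication test can be applied by enumerating a sample space. This sketch assumes a standard 52-card deck with one card drawn at random; the helper names (prob, are_independent, is_ace, and so on) are hypothetical.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))           # 52 (rank, suit) pairs

def prob(event):
    """Probability that a randomly drawn card satisfies the event."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def are_independent(event_a, event_b):
    """True when P(A and B) equals P(A) * P(B)."""
    p_both = prob(lambda card: event_a(card) and event_b(card))
    return p_both == prob(event_a) * prob(event_b)

def is_ace(card):
    return card[0] == "A"

def is_heart(card):
    return card[1] == "hearts"

def is_king(card):
    return card[0] == "K"

def is_face(card):
    return card[0] in {"J", "Q", "K"}

print(are_independent(is_ace, is_heart))   # rank and suit
print(are_independent(is_king, is_face))   # every king is a face card
```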

400

How does multiplying a random variable by a constant affect its standard deviation?

The standard deviation is multiplied by (the absolute value of) the constant
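
A short numerical check of this rule: scale every value by a constant and compare standard deviations. The values below are made up and treated as equally likely, and scaled_sd is a hypothetical helper name.

```python
from statistics import pstdev

def scaled_sd(values, constant):
    """Standard deviation after multiplying every value by a constant."""
    return pstdev([constant * v for v in values])

values = [2, 4, 4, 4, 5, 5, 7, 9]    # made-up values for illustration
original_sd = pstdev(values)
tripled_sd = scaled_sd(values, 3)
negated_sd = scaled_sd(values, -3)   # a negative constant still gives a positive SD
print(original_sd, tripled_sd, negated_sd)
```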

500

Describe the four characteristics you should include when describing a quantitative distribution

Shape, center, spread/variability, outliers

500

Describe how to determine the proportion of observations below a given value in a Normal distribution.

Convert the value to a z-score and use a Normal table or calculator
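
The two steps (standardize, then look up the area to the left) can be sketched in code, with the standard library's NormalDist standing in for the table or calculator. The mean of 100 and standard deviation of 15 are made-up parameters.

```python
from statistics import NormalDist

def proportion_below(value, mean, sd):
    """Return the z-score of value and the proportion of observations below it."""
    z = (value - mean) / sd              # step 1: standardize
    proportion = NormalDist().cdf(z)     # step 2: area to the left of z
    return z, proportion

z, proportion = proportion_below(130, 100, 15)
print(z, round(proportion, 4))
```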

500

Describe a situation where two variables might have a strong correlation but no causal relationship

Example: Ice cream sales and shark attacks (both affected by warm temperatures)

500

What does s represent for a linear regression between two variables?

The standard deviation of the residuals: the typical distance between the actual y-values and the values predicted by the regression line.
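
To make s concrete, the sketch below fits a least-squares line to a small made-up data set and computes s as the square root of the sum of squared residuals divided by n − 2. The function name residual_standard_deviation is hypothetical.

```python
from math import sqrt

def residual_standard_deviation(xs, ys):
    """Fit a least-squares line and return s = sqrt(SSE / (n - 2))."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    # Sum of squared residuals (observed minus predicted)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(sse / (n - 2))

xs = [1, 2, 3, 4, 5]   # made-up data for illustration
ys = [2, 4, 5, 4, 5]
s = residual_standard_deviation(xs, ys)
print(round(s, 3))
```

The result is in the same units as y, so it reads as a typical prediction error.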

500

What does it mean for two events to be independent?

The outcome of one event does not affect the probability of the other event

500

True or false: Two events can be both independent and mutually exclusive

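False (when both events have nonzero probability: if one occurs, the other cannot, so its probability changes)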

500

What two conditions must be met to model a binomial random variable using a Normal distribution?

- Large Counts condition: np ≥ 10 and n(1 − p) ≥ 10

- Trials are independent OR, when sampling without replacement, the sample size n < 10% of the population size N
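
Both conditions reduce to simple comparisons, sketched below. The function names large_counts_ok and ten_percent_ok are hypothetical, and the inputs in the usage lines are made up.

```python
def large_counts_ok(n, p, threshold=10):
    """Large Counts condition: np and n(1 - p) are both at least the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

def ten_percent_ok(n, population_size):
    """10% condition: the sample is less than 10% of the population."""
    return n < 0.10 * population_size

print(large_counts_ok(100, 0.3))   # np = 30 and n(1 - p) = 70
print(large_counts_ok(20, 0.1))    # np = 2, which is too small
print(ten_percent_ok(100, 5000))   # 100 is less than 500
```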