Describing Scatterplots & Correlation
Least Squares Regression Line
Residuals
r^2 and Standard Deviation
Outliers & Extrapolation
100

When describing a scatterplot, what are the four characteristics you must mention?

Direction, Form, Strength, and Unusual Features (Outliers).

100

In the equation $\hat{y} = a + bx$, what does the "hat" over the y signify?

It represents the predicted value of y.

100

What is the formula for calculating a residual?

Actual - Predicted ($y - \hat{y}$)
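
A minimal Python sketch of the residual formula. The numbers are hypothetical, chosen only for illustration.

```python
def residual(actual, predicted):
    """Residual = actual y minus predicted y."""
    return actual - predicted

# Hypothetical case: the line predicts 70, but the observed value is 80.
print(residual(80, 70))  # 10 (point lies above the line)

# Hypothetical case: the line predicts 70, but the observed value is 60.
print(residual(60, 70))  # -10 (point lies below the line)
```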

100

What is the name of the statistic r^2?

The Coefficient of Determination.

100

What is the term for predicting a y-value using an x-value that is far outside the range of your data?

Extrapolation.

200

What is the range of possible values for the correlation coefficient (r)?

Between $-1$ and $1$, inclusive.

200

Define the "Least Squares" part of the LSRL.

It is the line that minimizes the sum of the squared residuals.
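
A minimal Python sketch of this idea, using made-up data (the `xs`, `ys` values and the helper names are illustrative assumptions). It computes a and b from the standard closed-form formulas, then checks that nudging either one makes the sum of squared residuals larger.

```python
def lsrl(xs, ys):
    """Return (a, b) for the least squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # intercept
    return a, b

def sum_sq_residuals(xs, ys, a, b):
    """Sum of squared residuals for the line y_hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = lsrl(xs, ys)
best = sum_sq_residuals(xs, ys, a, b)

# Any other line gives a larger sum of squared residuals.
print(best < sum_sq_residuals(xs, ys, a + 0.1, b))  # True
print(best < sum_sq_residuals(xs, ys, a, b + 0.1))  # True
```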

200

If a data point is located below the regression line, is its residual positive or negative?

Negative.

200

If r = 0.7, what is the value of r^2?

0.49 (or 49%).

200

Does a high correlation (r) prove that x causes y?

No. Correlation does not imply causation.

300

If r = -0.95, describe the strength and direction of the relationship.

Strong and negative.

300

If the LSRL is $\widehat{\text{Score}} = 50 + 5(\text{Hours})$, interpret the slope of $5$ in context.

For every 1 additional hour studied, the predicted score increases by 5 points.
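
The slope interpretation can be checked with a short Python sketch. The function name `predicted_score` is an illustrative assumption. The intercept 50 and slope 5 come from the question.

```python
def predicted_score(hours):
    """Predicted score from the LSRL in the question: 50 + 5 * hours."""
    return 50 + 5 * hours

# One extra hour of study raises the predicted score by the slope.
print(predicted_score(3) - predicted_score(2))  # 5

# At zero hours, the prediction equals the y-intercept.
print(predicted_score(0))  # 50
```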

300

What should a "good" residual plot look like if a linear model is appropriate?

Randomly scattered points with no clear pattern or curve.

300

Interpret r^2 = 0.85 in context.

85% of the variation in the response variable (y) is explained by the linear relationship with the explanatory variable (x).
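
One way to see where this interpretation comes from is to compute r^2 as 1 minus (sum of squared residuals) / (total squared variation of y about its mean). The sketch below uses made-up data and an illustrative helper name, so treat it as an example rather than a prescribed method.

```python
def r_squared(xs, ys):
    """Fraction of the variation in y explained by the least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    sst = sum((y - y_bar) ** 2 for y in ys)                    # total
    return 1 - sse / sst

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(round(r_squared(xs, ys), 2))  # 0.6
```

For this made-up data set, about 60% of the variation in y is explained by the linear relationship with x.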

300

What do we call a point that, if removed, would significantly change the slope or y-intercept of the LSRL?

An influential point.

400

What happens to the correlation (r) if you switch the x and y variables?

It stays exactly the same.
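
A short Python sketch illustrates why: the formula for r treats x and y symmetrically. The data and the `correlation` helper are illustrative assumptions.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_xy = correlation(xs, ys)
r_yx = correlation(ys, xs)  # roles of x and y switched
print(abs(r_xy - r_yx) < 1e-12)  # True
```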

400

What is the interpretation of the y-intercept (a) in a regression line?

The predicted value of y when x is zero.

400

If a residual is +10, did the model overpredict or underpredict the actual value?

Underpredict (the actual value was 10 units higher than predicted).

400

What does the standard deviation of the residuals (s) represent?

The typical distance that the actual y-values are from the predicted y-values.
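
A minimal Python sketch of computing s, assuming the convention common in introductory statistics of dividing the sum of squared residuals by n - 2. The data and helper name are made up for illustration.

```python
from math import sqrt

def residual_std(xs, ys):
    """Standard deviation of the residuals, using an n - 2 divisor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(sse / (n - 2))

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(round(residual_std(xs, ys), 3))  # 0.894
```

Here the actual y-values are typically about 0.89 units away from the values the line predicts.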

400

A point that is an outlier in the x-direction (far to the left or right) is said to have high what?

High leverage.

500

Correlation only measures the strength of what specific type of relationship?

Linear relationships.

500

Every LSRL must pass through which specific coordinate point?

The point of averages $(\bar{x}, \bar{y})$.

500

If a residual plot shows a clear "U-shaped" pattern, what does that indicate?

A linear model is not appropriate (the data is likely curved).

500

If r^2 is 0.64 and the slope of the regression line is negative, what is the value of r?

-0.8 (You must take the negative square root because the slope is negative).
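
The same reasoning as a small Python sketch. The function name `r_from_r_squared` is an illustrative assumption. `math.copysign` gives the square root the same sign as the slope.

```python
from math import sqrt, copysign

def r_from_r_squared(r_squared, slope):
    """Recover r from r^2: r always has the same sign as the slope."""
    return copysign(sqrt(r_squared), slope)

# r^2 = 0.64 with a negative slope (the slope's size does not matter here).
print(round(r_from_r_squared(0.64, -2.0), 2))  # -0.8

# The same r^2 with a positive slope gives the positive root.
print(round(r_from_r_squared(0.64, 2.0), 2))  # 0.8
```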

500

What is the difference between an outlier and an influential point in regression?

An outlier has a large residual (it falls far from the line); an influential point is one whose removal would substantially change the slope or y-intercept of the line. A point can be one without being the other.
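
A rough Python sketch of the contrast, using made-up data. It adds one extra point to five points that lie exactly on y = x and compares the fitted slopes. All values and helper names are illustrative assumptions.

```python
def slope(xs, ys):
    """Slope of the least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Five made-up points lying exactly on y = x, so the slope is 1.
xs = [1, 2, 3, 4, 5]
ys = [1, 2, 3, 4, 5]
base = slope(xs, ys)

# Extra point (3, 10): far from the line in y, but at the mean of x.
with_y_outlier = slope(xs + [3], ys + [10])

# Extra point (20, 5): far from the other x-values (high leverage).
with_leverage = slope(xs + [20], ys + [5])

print(round(base, 2))            # 1.0
print(round(with_y_outlier, 2))  # 1.0
print(round(with_leverage, 2))   # 0.15
```

In this example the point with the large residual leaves the slope unchanged, while the high-leverage point pulls the slope from 1 down to about 0.15.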