Your data provides a predicted value of 58.74 and an observed value of 98.12. Does your model overestimate or underestimate the value at this point? Why?
Underestimate - the predicted value is below the observed.
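The comparison above can be checked with a short sketch. It assumes the usual convention residual = observed - predicted; the function names are illustrative, not from the original card.

```python
def residual(observed, predicted):
    """Residual under the common convention: observed minus predicted."""
    return observed - predicted


def describe_prediction(observed, predicted):
    """Say whether the model under- or overestimated at this point."""
    r = residual(observed, predicted)
    if r > 0:
        return "underestimate"  # prediction fell below the observed value
    if r < 0:
        return "overestimate"   # prediction landed above the observed value
    return "exact"


print(round(residual(98.12, 58.74), 2))      # 39.38
print(describe_prediction(98.12, 58.74))     # underestimate
```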
What is the equation for linear regression?
y=mx+b
f(x)=ax+b
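What does r² tell us?
The fraction of the variation in the response variable that is explained by the linear regression model. It measures how well the regression line fits the data.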
Indicate what association you expect for the pair of variables: positive, negative or none:
a person’s blood alcohol level; time it takes the person to solve a maze
Positive
Describe what correlation coefficient tells us
Indicates the direction (positive or negative) and strength of the linear relationship between two quantitative variables.
Predicted Price = 18.617 + 103.929x (x = Capacity). This is the regression equation for disk space Capacity (in megabytes) versus Price at a local store. With a capacity of 2 MB, what would you expect the price to be?
$226.475
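As a quick check of the arithmetic, here is a minimal sketch that plugs the capacity into the regression equation from the card. The function name is an assumption made for illustration.

```python
def predicted_price(capacity_mb):
    """Predicted Price = 18.617 + 103.929 * Capacity (capacity in megabytes)."""
    return 18.617 + 103.929 * capacity_mb


# 18.617 + 103.929 * 2 = 18.617 + 207.858
print(round(predicted_price(2), 3))  # 226.475
```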
The correlation between a cereal's fiber and potassium is r=0.903. What is the percentage of variation described by our linear regression equation?
0.8154
OR
82%
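The step from r to the percentage of variation is just squaring. A small sketch, with a hypothetical helper name:

```python
def r_squared(r):
    """Square a correlation coefficient to get the proportion of variation explained."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


print(round(r_squared(0.903), 4))   # 0.8154
print(f"{r_squared(0.903):.0%}")    # 82%
```

Because r is squared, a negative correlation gives a positive r² as well.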
Indicate what association you expect for the pair of variables: positive, negative or none:
the price charged for fund-raising candy bars; number of candy bars sold
Negative
Describe a correlation of -0.9 in terms of strength (weak, moderate, or strong) and direction.
Strong, negative.
How can you tell if a linear model is an appropriate fit for the data?
The correlation is strong (r is close to 1 or -1), the scatterplot looks roughly linear, and there is no pattern in the residual plot.
Two variables have a correlation of -0.89. What is the R-squared?
0.7921
Or
79%
Indicate what association you expect for the pair of variables: positive, negative or none:
the number of miles a student lives from school; the student’s grade point average
None
Which graph has a stronger correlation? (1,2,3)
[Graphs 1, 2, and 3 not shown]
2
Given the regression model: Predicted Verbal SAT Score = 171.333 + 0.6943 × Math SAT Score. Would you rather have a positive or negative residual? Why?
Positive - a positive residual means your actual score is higher than the predicted score.
The linear model for a local store's Number of Sales people working versus Sales is as follows: Sales = 8,106 + 91.34x (x = Number of Sales People Working). With 14 people working, what would you expect sales to be?
$9,384.76
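The linear regression of Price versus Size in Texas was used to predict price, and had an R-squared value of 71.49%. Does this indicate a positive or negative correlation? Why?
Positive, because bigger homes generally cost more. R-squared is never negative, so the direction has to come from the context (or the sign of the slope), not from R-squared itself.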
What does it mean for two variables to have a negative association?
As one variable increases the other variable decreases.
A restaurant's menu items are compared in terms of correlation. Sugar versus Calories have a correlation of 0.25. Sugar versus Protein has a correlation of -0.68. Which has a stronger correlation?
Sugar versus Protein.
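Strength depends on how far r is from 0, not on its sign, so the comparison uses absolute values. A minimal sketch with the two correlations from the card (the dictionary keys are just labels chosen here):

```python
# Correlations quoted in the question above.
correlations = {
    "Sugar vs Calories": 0.25,
    "Sugar vs Protein": -0.68,
}

# The stronger correlation is the one with the larger absolute value.
strongest = max(correlations, key=lambda pair: abs(correlations[pair]))
print(strongest)  # Sugar vs Protein
```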
The linear model for a local store's Number of Sales people working versus Sales is as follows: Sales = 8,106 + 91.34x (x = Number of Sales People Working). If two people are working (your x), the observed value is $8600. What is the residual?
$311.32
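The residual here takes two steps: predict sales from the model, then subtract the prediction from the observed value. A sketch, assuming residual = observed - predicted and using an illustrative function name:

```python
def predicted_sales(people_working):
    """Sales = 8,106 + 91.34 * (number of sales people working)."""
    return 8106 + 91.34 * people_working


observed = 8600
prediction = predicted_sales(2)           # 8106 + 182.68
sales_residual = observed - prediction    # observed minus predicted

print(round(prediction, 2))       # 8288.68
print(round(sales_residual, 2))   # 311.32
```

The same function gives the earlier card's answer for 14 people working: 8106 + 1278.76 = 9384.76.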
An analysis of Math SAT versus Verbal SAT scores gives an equation of Predicted Verbal SAT Score = 171.333 + 0.6943x (x = Math SAT Score). What would you predict someone's verbal score to be if they got a 520 on their Math section?
532.369
A regression analysis of students’ AP Statistics test scores and the number of hours they spent doing homework found r² = 0.32. What does this mean?
32% of the variability in students' test scores can be explained by a linear relationship with the number of hours they spent doing homework.
As the age of the car increases, its value decreases. Which scatterplot represents this relationship?
C
A student says, "There was a very strong correlation of 1.22 between Sugar and Fat content." Explain the mistake made here.
Correlation must be between -1 and 1, so a value of 1.22 is impossible.
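The range rule is easy to express as a sanity check. This is only a sketch with a hypothetical helper name:

```python
def is_valid_correlation(r):
    """Return True when r lies in the closed interval [-1, 1]."""
    return -1.0 <= r <= 1.0


print(is_valid_correlation(1.22))   # False
print(is_valid_correlation(-0.9))   # True
```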