is an observation that lies outside the overall pattern of the other observations
Outlier
CYU Page 168
The linear regression equation is weight = 100 + 40(time).
What is the slope of the regression line?
40
You use the same bar of soap to shower each morning. The bar weighs 80 grams when it is new. Its weight goes down by 6 grams per day on average. What is the equation of the regression line for predicting weight from days of use?
What is.. predicted weight = 80 − 6(days)
The fraction of the variation in the values of y that is explained by the least-squares regression of y on x is..
(a) the correlation.
(b) the slope of the least-squares regression line.
(c) the square of the correlation coefficient.
(d) the intercept of the least-squares regression line.
(e) the residual.
What is..
(c) the square of the correlation coefficient.
*Turn to page 204 in the textbook*
Which of the following gives a correct interpretation of s in this setting?
(a) For every 1°C increase in temperature, fish activity is predicted to increase by 4.785 units.
(b) The typical distance of the temperature readings from their mean is about 4.785°C.
(c) The typical distance of the activity level ratings from the least-squares line is about 4.785 units.
(d) The typical distance of the activity level readings from their mean is about 4.785.
(e) At a temperature of 0°C, this model predicts an activity level of 4.785.
What is...
(c) The typical distance of the activity level ratings from the least-squares line is about 4.785 units.
is a scatterplot of the residuals against the explanatory variable
Residual Plot
*Turn to page 174 of the textbook*
1. Find the residual for the truck that had 8359 miles driven and a price of $31,891. And what is this value in context?
Show your work.
What is..
1. y − ŷ = 31,891 − 36,895 = −$5004
2. The actual price of this truck is $5004 less than predicted based on the number of miles it has been driven.
The table below gives a small set of data. Which of the following two lines fits the data better: ŷ = 1 − x or ŷ = 3 − 2x? Use the least-squares criterion to justify your answer. (Note: Neither of these two lines is the least-squares regression line for these data.)
x: −1 1 1 3 5
y: 2 0 1 −1 −5
What is.. The line ŷ = 1 − x
(Why? The sum of squared residuals for this line is only 3, while the sum of squared residuals for ŷ = 3 − 2x is 18.)
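The two sums of squared residuals quoted above can be checked directly. This is a minimal sketch using only the five (x, y) pairs from the table; the function name `sum_squared_residuals` is just an illustrative choice.

```python
# Data from the table above.
x = [-1, 1, 1, 3, 5]
y = [2, 0, 1, -1, -5]


def sum_squared_residuals(intercept, slope):
    """Sum of (observed y - predicted y)^2 for the line y-hat = intercept + slope * x."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


ssr_line_1 = sum_squared_residuals(1, -1)  # y-hat = 1 - x
ssr_line_2 = sum_squared_residuals(3, -2)  # y-hat = 3 - 2x

print(ssr_line_1, ssr_line_2)  # prints: 3 18
```

The line with the smaller sum of squared residuals fits better under the least-squares criterion.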
Which of the following statements is NOT true of the correlation r between the lengths in inches and weights in pounds of a sample of brook trout?
(a) r must take a value between −1 and 1.
(b) r is measured in inches.
(c) If longer trout tend to also be heavier, then r > 0.
(d) r would not change if we measured the lengths of the trout in centimeters instead of inches.
(e) r would not change if we measured the weights of the trout in kilograms instead of pounds.
What is...
(b) r is measured in inches
When we standardize the values of a variable, the distribution of standardized values has mean 0 and standard deviation 1. Suppose we measure two variables X and Y on each of several subjects. We standardize both variables and then compute the least-squares regression line. Suppose the slope of the least-squares regression line is −0.44.
We may conclude that..
(a) the intercept will also be −0.44.
(b) the intercept will be 1.0.
(c) the correlation will be 1/−0.44.
(d) the correlation will be 1.0.
(e) the correlation will also be −0.44.
What is...
(e) the correlation will also be −0.44.
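Why the slope equals r here: the least-squares slope is b = r(s_y / s_x), and after standardizing, s_x = s_y = 1. The sketch below illustrates this on a small made-up data set (the x and y values are hypothetical, chosen only for the demonstration, and are not from the textbook).

```python
from statistics import mean, stdev


def standardize(values):
    """Convert values to z-scores (mean 0, standard deviation 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def least_squares(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def correlation(x, y):
    """Correlation r as the average product of z-scores (dividing by n - 1)."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [4.0, 5.0, 2.0, 3.0, 1.0]

r = correlation(x, y)
slope, intercept = least_squares(standardize(x), standardize(y))
print(r, slope, intercept)  # slope matches r; intercept is 0 up to rounding error
```

The intercept comes out as 0 because the least-squares line always passes through the point of means, which is (0, 0) after standardizing.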
ŷ is the (blank) value of the response variable y for a given value of x
Predicted
*Turn to page 174 of the textbook*
3. For which truck did the regression line overpredict price by the most?
Justify your answer.
What is..
The truck with 44,447 miles and a price of $22,896. This truck has a residual of −$8120, which means that the line overpredicted the price by $8120.
A) Very little association
B) A weak negative association
C) A strong negative association
D) A strong positive association
D) A strong positive association
A school guidance counselor examines the number of extracurricular activities that students do and their grade point average. The guidance counselor says, “The evidence indicates that the correlation between the number of extracurricular activities a student participates in and his or her grade point average is close to zero.”
A correct interpretation of this statement would be that....
(a) active students tend to be students with poor grades, and vice versa.
(b) students with good grades tend to be students who are not involved in many extracurricular activities, and vice versa.
(c) students involved in many extracurricular activities are just as likely to get good grades as bad grades; the same is true for students involved in few extracurricular activities.
(d) there is no linear relationship between number of activities and grade point average for students at this school.
(e) involvement in many extracurricular activities and good grades go hand in hand.
What is..
(d) There is no linear relationship between number of activities and grade point average for students at this school.
There is a linear relationship between the number of chirps made by the striped ground cricket and the air temperature. A least-squares fit of some data collected by a biologist gives the model ŷ = 25.2 + 3.3x, where x is the number of chirps per minute and ŷ is the estimated temperature in degrees Fahrenheit.
What is the predicted increase in temperature for an increase of 5 chirps per minute?
(a) 3.3°F
(b) 16.5°F
(c) 25.2°F
(d) 28.5°F
(e) 41.7°F
What is..
(b) 16.5°F
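The predicted change depends only on the slope: 3.3 × 5 = 16.5. A small sketch of that arithmetic, where the starting chirp rate of 20 is an arbitrary assumption (any starting value gives the same difference because the intercept cancels):

```python
def predicted_temp_f(chirps_per_minute):
    """Model from the card: y-hat = 25.2 + 3.3x."""
    return 25.2 + 3.3 * chirps_per_minute


base_chirps = 20  # hypothetical starting rate, chosen only for illustration
increase = predicted_temp_f(base_chirps + 5) - predicted_temp_f(base_chirps)
print(round(increase, 1))  # prints: 16.5
```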
An observation is (blank) for a statistical calculation if removing it would markedly change the result of the calculation.
Influential
CYU Page 168
Read full question
Should you use this line to predict the rat's weight at age 2 years? Use the equation to make the prediction and think about the reasonableness of the result. (There are 454 grams in a pound)
No. 2 years = 104 weeks, so ŷ = 100 + 40(104) = 4260 grams. This is equivalent to about 9.4 pounds, which is unreasonable for a rat.
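The arithmetic behind this answer, as a short sketch. It assumes, as the answer does, that time is measured in weeks and weight in grams.

```python
GRAMS_PER_POUND = 454  # conversion given in the question


def predicted_weight_grams(weeks):
    """Regression line from the card: weight = 100 + 40(time)."""
    return 100 + 40 * weeks


weeks = 2 * 52                      # 2 years
grams = predicted_weight_grams(weeks)
pounds = grams / GRAMS_PER_POUND
print(grams, round(pounds, 1))      # prints: 4260 9.4
```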
What is wrong with this sentence?
The correlation between planting rate and yield of corn was found to be r = 0.23 bushel.
The correlation r has no units.
Other things being equal, larger automobile engines are less fuel-efficient. You are planning an experiment to study the effect of engine size (in liters) on the fuel efficiency (in mpg) of sport utility vehicles. In this study,
A) Gas mileage is a response variable, and you expect to find a negative association
B) Gas mileage is a response variable, and you expect to find a positive association
C) Gas mileage is an explanatory variable, and you expect to find a strong negative association
D) Gas mileage is an explanatory variable, and you expect to find a strong positive association
A) Gas mileage is a response variable, and you expect to find a negative association
In a statistics course, a linear regression equation was computed to predict the final exam score from the score on the first test. The equation was ŷ = 10 + 0.9x, where y is the final exam score and x is the score on the first test. Drake Maye scored a 95 on the first test. What is the predicted value of his score on the final exam?
A) 85.5
B) 90
C) 95
D) 95.5
D) 95.5
ŷ = 10 + 0.9(95) = 95.5
The (blank) line of y on x is the line that makes the sum of the squared residuals as small as possible
Least-squares regression
Body weight (lb): 120 187 109 103 131 165 158
Backpack weight (lb): 26 30 26 24 29 35 31
One of the hikers had a residual of nearly 4 pounds. Interpret this value
The backpack for this hiker was almost 4 pounds heavier than expected based on the weight of the hiker
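To see where a residual of "nearly 4 pounds" could come from, the sketch below fits a least-squares line to the seven pairs listed on this card and lists each residual. This is an assumption-laden check: the line used in the original question may have been fit to a different or fuller data set, so treat the numbers as illustrative.

```python
from statistics import mean

# The seven (body weight, backpack weight) pairs listed on the card, in pounds.
body = [120, 187, 109, 103, 131, 165, 158]
pack = [26, 30, 26, 24, 29, 35, 31]


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


slope, intercept = fit_line(body, pack)

# Residual = actual backpack weight - predicted backpack weight.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(body, pack)]

largest = max(residuals)
hiker = body[residuals.index(largest)]
print(hiker, round(largest, 1))  # prints: 165 3.8
```

A positive residual means the actual backpack weight is above the line, which matches the interpretation "heavier than expected."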
Predicted highway mpg = 4.62 + 1.109(city mpg)
Find the predicted highway mileage for a car that gets 16 miles per gallon in the city.
22.36 mpg, since 4.62 + 1.109(16) = 22.364
Sarah’s parents are concerned that she seems short for her age. Their doctor has the following record of Sarah’s height:
Age (months): 36 48 51 54 57 60
Height (cm): 86 90 91 93 94 95
(b) Using your calculator, find the equation of the least squares regression line of height on age.
(c) Use your regression line to predict Sarah’s height at age 40 years (480 months). Convert your prediction to inches (2.54 cm = 1 inch).
(d) The prediction is impossibly large. Explain why this happened
What is...
(b) ŷ = 71.95 + 0.3833x, where y = height and x = age.
(c) 100.76 inches (255.934 cm)
(d) This was an extrapolation. The data cover only ages 36 to 60 months (the first 5 years of life), and the linear trend will not continue forever.
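The calculator steps in (b) and (c) can be reproduced by hand-coding the least-squares formulas. This is a sketch, not the calculator's exact routine; it uses the unrounded slope, so the prediction comes out near 255.95 cm rather than the 255.934 cm obtained from the rounded slope 0.3833.

```python
from statistics import mean

age_months = [36, 48, 51, 54, 57, 60]
height_cm = [86, 90, 91, 93, 94, 95]


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


slope, intercept = fit_line(age_months, height_cm)
print(round(slope, 4), round(intercept, 2))  # prints: 0.3833 71.95

# Extrapolate to age 40 years = 480 months, then convert to inches.
predicted_cm = intercept + slope * 480
predicted_in = predicted_cm / 2.54
print(round(predicted_in, 1))  # prints: 100.8
```

A predicted adult height of over 8 feet shows why a line fit to ages 36 to 60 months should not be used at 480 months.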
*Turn to page 205 in textbook*
Drilling down beneath a lake in Alaska yields chemical evidence of past changes in climate. Biological silicon, left by the skeletons of single-celled creatures called diatoms, is a measure of the abundance of life in the lake. A rather complex variable based on the ratio of certain isotopes relative to ocean water gives an indirect measure of moisture, mostly from snow. As we drill down, we look further into the past. Here is a scatterplot of data from 2300 to 12,000 years ago:
(a) Identify the unusual point in the scatterplot. Explain what’s unusual about this point.
(b) If this point was removed, describe the effect on
i. the correlation.
ii. the slope and y intercept of the least-squares line.
iii. the standard deviation of the residuals.
What is....
(a) The point in the upper-right-hand corner has a very high silicon value for its isotope value.
(b)
(i) r would get closer to −1 because this point does not follow the linear pattern of the other points.
(ii) Because this point is “pulling up” the line on the right side of the plot, removing it will make the slope steeper (more negative) and the y intercept smaller (note that the y axis is to the right of the points in the scatterplot).
(iii) Because this point has a large residual, removing it will make s a little smaller.