Chapter 7 VOCAB
Chapter 7 PROBLEMS
Chapter 8 VOCAB
Chapter 8 PROBLEMS
vocab + problems 7&8
100

Fill in the blank: A ___ shows the relationship between two quantitative variables measured on the same


What is a Scatter Plot?

100

Write a single sentence evaluating the following claim: The correlation between the amount of fertilizer used and plant height is −1.50.

Correlation can only be within the range -1 to 1. 

100

Fill in the blank: The ___ gives a value in "y-units per x-unit." Changes of one unit in x are associated with changes of b1 units in predicted values of y. For example, the ___ is interpreted by saying, "With each additional (x-variable in context), the predicted (y-variable) increases or decreases by the ___."

What is slope?

100

A group of environmental scientists is studying the relationship between the number of trees planted in a city park and the reduction in air pollution levels (measured in μg/m³ of particulate matter). They collect data from several parks and calculate the correlation between trees planted (x) and pollution reduction (y) as: r=0.84.

(a) Interpret r. Does r have units?

(b) Calculate R^2 for this relationship.

(c) Interpret R^2 in context of this study.

a. The linear relationship between the number of trees planted and the reduction in air pollution levels (measured in μg/m³ of particulate matter) is roughly strong and positive, and r does not have any units.

b. R^2 = (0.84)^2 = 0.7056 = 70.56%

c. 70.56% of the variation in the reduction in air pollution levels (measured in μg/m³ of particulate matter) is explained by the linear relationship with the number of trees planted.

100

What is this called? y hat = mx + b, which can also be written as y hat = b + mx. In the textbook the equation is written: y hat = b0 + b1x

What is the least squares regression line?

200

What is this term? A point that does not fit the overall pattern seen in the scatterplot.

What is an Outlier?

200

Write a single sentence commenting on the following assertion: Since the correlation between years of education and annual income is about 0.67, the correlation between annual income and years of education is about −0.67

The correlation remains the same regardless of which variable is considered x or y, so it is still about 0.67.

200

Fill in the blank: The ____________ gives a starting value in y-units. It's the predicted y value when x is 0. For example, the _________ is interpreted by saying, "When (x = 0, in context), the predicted (y-variable) is _______."

What is the y-intercept?

200

A school counselor is studying the relationship between the number of hours per week students spend studying and their final exam scores in a statistics class. 

Given information: 

  • Mean study time: 8 hours

  • Standard deviation of study time: 2 hours

  • Mean exam score: 78 points

  • Standard deviation of exam scores: 6 points

  • Correlation between study time and exam score: r=0.65


The counselor wants to model exam score as a function of study time using the least squares regression line.

(a) Calculate the slope of the least squares regression line.
(b) Interpret the slope in context of the problem.

(c) Write the least squares regression line equation


(a)

Slope = correlation times Standard Dev of y values divided by standard deviation of x values.

Slope = 0.65 times 6/2

slope = 3.9/2

slope = 1.95

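(b) With each additional hour of study time per week, the predicted final exam score increases by 1.95 points.

(c) y intercept of LSRL = mean of y values - (LSRL slope times mean of x values)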

y intercept = 78 - (1.95 times 8)

y intercept = 78 - (15.6)

y intercept = 62.4

equation:

 y hat = 1.95x + 62.4
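The slope and y-intercept steps above can be checked with a few lines of Python. This is only a minimal sketch: the function name `lsrl_from_summary` and its parameter names are made up for illustration, and the numbers come from the study-time problem.

```python
def lsrl_from_summary(mean_x, sd_x, mean_y, sd_y, r):
    """Return (slope, intercept) of the least squares regression line.

    slope     = r * (SD of y) / (SD of x)
    intercept = (mean of y) - slope * (mean of x)
    """
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Values from the study-time problem: x = hours studied, y = exam score.
slope, intercept = lsrl_from_summary(mean_x=8, sd_x=2, mean_y=78, sd_y=6, r=0.65)
print(f"y hat = {slope:.2f}x + {intercept:.1f}")
```

The same helper works for any problem that gives the two means, the two standard deviations, and r.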

200

How is r interpreted, and does it have units?

The linear relationship between the explanatory variable and the response variable is (strength) and (direction); r does not have any units.

300

What is this term? A variable other than x and y that simultaneously affects both variables, accounting for the correlation between the two.


What is a lurking variable?

300

The relationship between body length (in feet) and weight (in pounds) for a random sample of grizzly bears is modeled by the regression equation: y hat (weight) = 420x - 2800 

where: 

predicted weight is y hat 

body length (feet) is x

One of the largest grizzly bears ever recorded was measured at 11 feet long and had an actual weight of 1,950 pounds.

Calculate and interpret the residual for this grizzly bear using the given model.

y hat = 420x - 2800

y hat = 420(11) - 2800

y hat = 4620 - 2800

y hat = 1820

residual = 1950 - 1820

residual = 130 

The actual value of weight is 130 pounds above the  predicted weight for a grizzly bear who is 11 feet long.
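The residual calculation above (actual minus predicted) can be sketched in Python as a quick check. The function names `predict_weight` and `residual` are hypothetical; the model and the bear's measurements are taken from the problem.

```python
def predict_weight(length_ft):
    """Predicted weight in pounds from the given model: y hat = 420x - 2800."""
    return 420 * length_ft - 2800


def residual(actual, predicted):
    """Residual = actual y - predicted y (y - y hat)."""
    return actual - predicted


predicted = predict_weight(11)    # bear is 11 feet long
res = residual(1950, predicted)   # bear actually weighs 1,950 pounds
print(f"predicted = {predicted} lb, residual = {res} lb")
```

A positive residual means the actual value is above the prediction; a negative residual means it is below.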

300

Fill in the blank: The value of y hat found for a given x-value in the data. A ______ value is found by substituting the x-value in the regression equation. The ____ values are the values on the fitted line; the points (x, y hat) all lie exactly on the fitted line.

What is predicted value(s)?

300

A project manager wants to study the relationship between the number of hours employees spend in professional development training each week and their weekly task completion score (out of 100):

  • Mean training time: 5 hours per week

  • Standard deviation of training time: 1.5 hours

  • Mean weekly task completion score: 82 points

  • Standard deviation of weekly task completion score: 7 points

  • Correlation between training time and weekly task completion: r=0.72

The manager wants to model task completion score as a function of training time using the least squares regression line.

Questions:

(a) Calculate the slope of the least squares regression line.
(b) Calculate the y-intercept of the least squares regression line.
(c) Interpret the slope in context of the problem.
(d) Use the model to predict the weekly task completion score for an employee who spends 7 hours per week in training. (round to nearest whole number)

a. slope = correlation times SD of y values divided by SD of x values

slope = 0.72 times 7 / 1.5

slope = 5.04/1.5 

slope = 3.36

b. y intercept = mean of y values - (slope times mean of x values)

y intercept = 82 - (3.36 times 5)

y intercept = 82 - 16.8

y intercept = 65.2

c. With each additional hour of training per week, the predicted weekly task completion score increases by 3.36 points.

d. y hat = 3.36x + 65.2

y hat = 3.36(7) + 65.2

y hat = 23.52 + 65.2

y hat = 88.72, which rounds to 89

An employee who spends 7 hours per week in training has a predicted weekly task completion score of 89.

300

Mia tracks her workouts using a fitness app that records the distance she cycles and the number of calories she burns. She collects data from 18 cycling sessions. A scatterplot of x = distance cycled (in miles) and y = calories burned shows a strong positive linear relationship. The regression equation that models the data is: y hat = 50 + 90x.

a. Interpret the slope of the regression line.
b. Does the y-intercept have a meaningful interpretation in this situation? Explain your reasoning.
c. Predict the number of calories Mia would burn if she cycles 12 miles.
d. Suppose her app reports that she actually burned 1,100 calories on a 12-mile ride. Calculate and interpret the residual.
e. Mia is considering training for a 100-mile cycling event. Her longest ride so far has been 25 miles. Should she use this regression equation to predict the calories burned for a 100-mile ride? Explain.

a.  With each additional mile Mia cycles, the predicted calories burned increases by 90.

b. Yes. When the distance cycled is 0 miles, the predicted calories burned is 50.

c. y hat = 50 +90(12)

y hat = 50 + 1080 

y hat = 1130. Mia is predicted to burn 1,130 calories on a 12-mile ride.

d. residual = 1,100 - 1,130 

residual = -30

The actual value of the calories burned was 30 below the predicted value of calories burned for x = 12 miles.

e. No, Mia should not use this regression equation to predict calories burned for a 100-mile ride. This would be extrapolation, because 100 miles is far outside the range of her observed data (which only goes up to 25 miles). The linear relationship may not hold for such long distances, making the prediction unreliable.
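The prediction, residual, and extrapolation check from parts c through e can be sketched in Python. This is an illustrative sketch only: `predict_calories` and `is_extrapolation` are made-up names, and because the problem gives only Mia's longest ride (25 miles), the check looks only at the upper end of the data.

```python
MAX_OBSERVED_MILES = 25  # longest ride so far, taken from the problem statement


def predict_calories(miles):
    """Predicted calories from the given model: y hat = 50 + 90x."""
    return 50 + 90 * miles


def is_extrapolation(miles, max_observed=MAX_OBSERVED_MILES):
    """True when x lies beyond the largest x-value in the data."""
    return miles > max_observed


predicted_12 = predict_calories(12)
residual_12 = 1100 - predicted_12  # actual - predicted for the 12-mile ride

for miles in (12, 100):
    note = "extrapolation" if is_extrapolation(miles) else "not beyond the longest ride"
    print(f"{miles} miles: predicted {predict_calories(miles)} calories ({note})")
print(f"residual at 12 miles: {residual_12}")
```

The model still produces a number for 100 miles, which is the point of part e: the equation will always give an answer, so it is up to the user to notice when x is outside the data.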

400

Fill in the blank: The ____ is a numerical measure of the direction and strength of a linear association.

What is correlation coefficient?

400

David is a manager at a technology firm and wants to analyze how years of experience relate to annual salary for software developers at his company. He records data for several employees and creates a linear regression model to describe the relationship. Years of Experience: 2, 6, 12, 14. Salary (dollars): 58000, 66500, 74200, 78000.  The regression equation is: y hat=54,300+1,950x.

where

  • x = years of experience

  • y hat = predicted salary (in dollars)

a. What is the residual for an employee with 6 years of experience?
b. Use the model to predict the salary of an employee with 10 years of experience.
c. Use the model to predict the salary of an employee with 28 years of experience.
d. Which prediction should be trusted more, the prediction for 10 years or for 28 years? Explain why.

a. y hat = 54,300+1,950x.

y hat = 54,300 + 1950 (6)

y hat = 54,300 + 11700

y hat = 66000

residual = 66500 - 66000 = 500

The actual value of salary is 500 dollars above the predicted salary for an employee with 6 years of experience. 

b. 

y hat = 54,300+1,950x.

y hat = 54,300 + 1950 (10)

y hat = 54,300 + 19500

y hat =73800

An employee with 10 years of experience is predicted to have a salary of 73,800 dollars.

c. 

y hat = 54,300+1,950x.

y hat = 54,300 + 1950 (28)

y hat = 54,300 +54600

y hat = 108,900

An employee with 28 years of experience is predicted to have a salary of 108,900 dollars.

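d. The prediction for 10 years should be trusted more because 10 years is within the range of the observed data (2 to 14 years). Predicting for 28 years is extrapolation: it is far outside the observed data, where the linear relationship may not hold.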

 

400

What does R^2 mean and how is it interpreted?

R^2 is the square of the correlation between y and x. R^2 gives the fraction of the variability of y accounted for by the least squares linear regression on x. R^2 is an overall measure of how successful the regression is in linearly relating y to x.

Interpreted: R^2 (%) of the variation in the (response variable) is explained by the linear relationship with the (explanatory variable).

400

Based on this residual plot (shown without the regression line), is a linear model appropriate for the data? (Mr. F has the picture.)

Yes, the LSRL is appropriate because the residual plot has no pattern

400

A linear regression model was developed to predict the number of goals scored by soccer players based on the number of hours they trained. The regression equation is given by: y hat = 10 + 2x, where x is the number of hours trained and y hat is the predicted number of goals scored. A player who trained for 6 hours scored 12 goals in an actual game. What is the residual for this player?

y hat = 10 + 2(6)

y hat = 10 + 12 

y hat = 22 goals 

residual = 12 - 22 = -10 

The actual value of goals is 10 goals below the predicted value of goals scored when x = 6 hours.

500

How do you describe the relationship between the explanatory variable and the response variable in a scatter plot?

Using DUFS; direction, unusual features, form, and strength 

500

Maria is a fitness coach who wants to analyze how the number of weekly training hours relates to bench press weight (in pounds) for her clients. She records data for several clients and creates a linear regression model to describe the relationship. Weekly Training hours: 3, 5, 8, 10. Bench Press Weight (lbs): 122, 152, 193, 215.

equation: y hat = 105 +11x

where:

  • x= weekly training hours

  • y hat = predicted bench press weight (lbs)

(a) What is the residual for a client who trains 5 hours per week + interpretation?
(b) Use the model to predict the bench press weight of a client who trains 7 hours per week.
(c) Use the model to predict the bench press weight of a client who trains 15 hours per week.
(d) Which prediction should be trusted more, the prediction for 7 hours or for 15 hours? Explain why.

a. y hat = 105 + 11x

 y hat = 105 + 11(5)

y hat = 105 + 55

y hat = 160

residual = 152 - 160

residual = -8

The actual value of bench pressed weight (pounds) was 8 below the predicted bench pressed weight when the weekly training hours was 5.

b. y hat = 105 + 11x

y hat = 105 + 11(7)

y hat = 105 +77

y hat = 182 

A client who trains 7 hours per week is predicted to bench press 182 pounds.

c. y hat = 105 + 11x

y hat = 105 + 11(15)

y hat = 105 + 165

y hat = 270

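A client who trains 15 hours per week is predicted to bench press 270 pounds.

d. The 7-hour prediction should be trusted more because 7 hours is within the range of the observed data (3 to 10 hours). Predicting for 15 hours is extrapolation, since 15 hours is far outside the observed data.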

500

Fill in the blank: _____ are the differences between data values and the corresponding values predicted by the regression model—or, more generally, values predicted by any model. This is the equation ______ = y - y hat


What is Residual(s)?

500

A science class investigates how the number of springs attached to a toy car affects the distance the car travels (meters). The results are shown below: Number of springs: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. Distance travelled: 8, 11, 15, 19, 24, 27, 31, 36, 48, 63, 87.

a. Identify which variable is the explanatory variable and which is the response variable. 

b. Create a scatterplot in Excel, then paste it below. Include appropriate scales, axes, and titles. (Mr. F has picture)

c. Describe the relationship (using DUFS: direction, unusual features, form, and strength).

d. Estimate the correlation (r) of your distribution. What are the units?

e. find the least squares regression line for your data. (excel)

f. Calculate Correlation (excel, don't round)

g. Identify and interpret the slope of the LSRL

h. Identify and interpret the y-intercept of the LSRL (if the y - intercept is invalid in this context state that as well)

I. Calculate and interpret the residual for 4 rubber springs. (Round to the nearest hundredth.)

J. What is R^2? Interpret it.


a. Number of springs is the explanatory variable, and distance travelled (meters) is the response variable.

b. Go to the teams graph and check to see how accurate it is. I sent the correct pic to Mr. F so check teams to see if team is correct. Because I could not add pics on here :(

c. The relationship between the number of springs and the distance travelled (meters) is roughly strong, positive, and linear. There is one clear unusual feature at x = 10, where the distance travelled is 87 meters.

d. 0.80, accepting 0.85-0.87, no units

e. y hat = 6.7545x - 0.2273

f. 0.9304671

g. 6.7545, With each additional spring the predicted distance travelled (in meters) increases by 6.7545.

h. -0.2273. When the number of springs is 0, the predicted distance travelled is -0.2273 meters. This y-intercept is not valid in this context because a distance travelled cannot be negative.

I. y hat = 6.7545x - 0.2273

y hat = 6.7545(4) - 0.2273

y hat = 27.018 - 0.2273

y hat = 26.7907

residual = 24 - 26.7907 

residual = -2.79

The actual value of distance travelled is 2.79 meters below the predicted distance travelled for 4 springs.

J. 0.8658. 86.58% of the variation in the distance travelled is explained by the linear relationship with the number of springs.
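The problem asks for Excel, but the answers to parts e, f, I, and J can also be cross-checked with a short Python sketch. This is an assumed alternative, not the method the class used; it applies the usual sums-of-squares formulas to the data from the problem, and the variable names are made up for illustration.

```python
from math import sqrt

# Data from the problem: x = number of springs, y = distance travelled (meters).
springs = list(range(11))
distance = [8, 11, 15, 19, 24, 27, 31, 36, 48, 63, 87]

n = len(springs)
mean_x = sum(springs) / n
mean_y = sum(distance) / n

# Sums of squares and cross-products around the means.
sxx = sum((x - mean_x) ** 2 for x in springs)
syy = sum((y - mean_y) ** 2 for y in distance)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(springs, distance))

slope = sxy / sxx                    # part e / g
intercept = mean_y - slope * mean_x  # part e / h
r = sxy / sqrt(sxx * syy)            # part f
r_squared = r ** 2                   # part J

# Part I: residual for 4 springs (actual - predicted).
residual_4 = distance[4] - (slope * 4 + intercept)

print(f"y hat = {slope:.4f}x + ({intercept:.4f})")
print(f"r = {r:.7f}, R^2 = {r_squared:.4f}")
print(f"residual at 4 springs = {residual_4:.2f}")
```

The printed values should agree with the Excel answers above up to rounding.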

500

A school administrator randomly selects 40 textbooks from a storage room and records the number of pages in each book and the weight of each textbook (in pounds). The relationship between x=number of pages, y= weight of the textbook (in pounds) is modeled by the regression equation: y hat=1.2+0.004x  

a. What is the predicted weight of a textbook with 600 pages? (Show your work)
b. Identify and interpret the slope of the regression line.
c. Identify and interpret the y-intercept.

a. y hat = 1.2 + 0.004(600)

y hat = 1.2 +2.4

y hat = 3.6 

A 600-page textbook is predicted to weigh 3.6 pounds.

b. The slope is 0.004. With each additional page in the book, the predicted weight of the book increases by 0.004 pounds.

c. The y-intercept is 1.2. When the number of pages in the book is 0, the predicted weight of the book is 1.2 pounds.

 
