Type of data in which the data can be separated into groups. (example: types of pizza toppings)
Categorical Data
Complete the missing cell: Table showing Course Enrollment
Algebra Geometry Total
9th 25 ? 60
10th 30 10 60
Total 55 40 120
Find the missing value for Geometry in 9th grade.
35
Elena ran a 5k, or 5 kilometers, on 14 different days last month. The scatter plot shows the time in minutes it took her to finish the run and her heart rate in beats per minute after the run.
The scatter plot includes a point at (31, 155). Describe the meaning of this point in this situation.
On a day when Elena finished the run in 31 minutes, her heart rate after the run was 155 beats per minute.
If a model predicts a value of 12 and the actual value is 18, this is the value of the residual.
6
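As a quick check of the card above, here is a minimal sketch of the residual calculation, using the definition of a residual as the actual value minus the predicted value. The function name `residual` is only an illustrative choice.

```python
def residual(actual, predicted):
    """Residual = actual value minus the value the model predicted."""
    return actual - predicted

# Values from the card: the model predicts 12, the actual value is 18.
r = residual(actual=18, predicted=12)
print(r)  # 6
```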
A correlation coefficient of +1 indicates this type of relationship.
Perfect positive linear relationship (the strongest possible positive relationship)
The relationship is ________________ when an increase in the data for one variable tends to be paired with an increase in the data for another variable.
positive
Given this frequency table:
Sports Music None Total
Pass 40 10 50 100
Fail 20 15 15 50
Total 60 25 65 150
What fraction of all students failed and do Music?
15/150 = 1/10
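The fraction above can be checked with a short sketch. The dictionary layout and the variable names are assumptions made for illustration; the counts come from the frequency table.

```python
from fractions import Fraction

# Counts from the frequency table (rows: result, columns: activity).
table = {
    "Pass": {"Sports": 40, "Music": 10, "None": 50},
    "Fail": {"Sports": 20, "Music": 15, "None": 15},
}

# Grand total of all students in the table.
grand_total = sum(sum(row.values()) for row in table.values())

# Students who failed AND do Music, as a fraction of ALL students.
fail_and_music = Fraction(table["Fail"]["Music"], grand_total)
print(grand_total, fail_and_music)  # 150 1/10
```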
A seed is planted in a glass pot and its height is measured in centimeters every day. The best fit line is given by the equation y = 0.404x − 5.18, where y represents the height of the plant above ground level, and x represents the number of days since it first sprouted.
What does the slope of the line mean in this situation?
The model estimates that the plant's height increases by 0.404 centimeters each day.
A student gets a negative residual. What does this tell you about the estimate compared to the actual value?
The estimate was higher than the actual value.
If the correlation coefficient is close to 0, this is what it tells you about the relationship between the variables.
There is no correlation or a very weak correlation between the variables.
A number between -1 and 1 that describes the strength and direction of a linear association between two numerical variables.
correlation coefficient
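To make the definition concrete, here is a minimal sketch that computes the correlation coefficient by hand. The function name and the two small data sets are hypothetical, made up only to show a perfect rising line and a perfect falling line.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, made up for illustration only.
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # points lie exactly on a rising line
falling = [10, 8, 6, 4, 2]   # points lie exactly on a falling line

print(correlation(hours, rising))   # r is 1 for a perfect rising line
print(correlation(hours, falling))  # r is -1 for a perfect falling line
```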
Given this frequency table:
Sports Music None Total
Pass 40 10 50 100
Fail 20 15 15 50
Total 60 25 65 150
What percent of the students who do not do a listed extracurricular passed? Round to the nearest whole percent.
50/65 ≈ 77%
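A short sketch of the same calculation, assuming the "None" column of the table is the group of interest. The variable names are illustrative only.

```python
# Counts from the "None" column of the frequency table.
passed_none = 50
failed_none = 15

# The denominator is only the students with no listed activity,
# not all 150 students.
none_total = passed_none + failed_none

percent_passed = round(100 * passed_none / none_total)
print(none_total, percent_passed)  # 65 77
```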
A seed is planted in a glass pot and its height is measured in centimeters every day. The best fit line is given by the equation y = 0.404x − 5.18, where y represents the height of the plant above ground level, and x represents the number of days since it first sprouted.
What does the y-intercept of the line mean in this situation?
At 0 days, the model puts the seed at −5.18 centimeters, which means 5.18 centimeters below ground level.
A scatterplot shows most residuals close to zero. What does this tell you about the model?
The line of best fit makes good predictions.
You calculate a correlation of –0.85 for hours of TV watched and homework completion rate. What are the direction and strength of the relationship?
Strong negative relationship
The difference between the y value for a point in a scatter plot and the value predicted by a linear model.
Residual
Is there an association between working in the office and ordering lunch out?
Work in office Work at home
Order lunch 476 215
Lunch at home 178 269
Yes, there is an association. About 73% of people who work in the office order lunch (476 of 654), while about 56% of people who work at home eat lunch at home (269 of 484).
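One way to check for an association is to compare column relative frequencies, that is, each count as a percent of its column total. The sketch below does this for the lunch table; the helper name `column_percents` and the dictionary keys are assumptions for illustration.

```python
def column_percents(column):
    """Each count as a whole-number percent of its column total."""
    total = sum(column.values())
    return {label: round(100 * count / total) for label, count in column.items()}

# Counts from the table, grouped by where people work.
office = {"order lunch": 476, "lunch at home": 178}
home = {"order lunch": 215, "lunch at home": 269}

print(column_percents(office))  # {'order lunch': 73, 'lunch at home': 27}
print(column_percents(home))    # {'order lunch': 44, 'lunch at home': 56}
```

Because the two columns split very differently (73% versus 44% ordering lunch), the table suggests an association.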
At a restaurant, the total bill and the percentage of the bill left as a tip are represented in the scatter plot.
The best fit line is represented by the equation y = −0.632x + 27.1, where x represents the total bill in dollars, and y represents the percentage of the bill left as a tip.
What does the best fit line estimate for the percentage of the bill left as a tip when the bill is $15? Round to the nearest whole percent.
y = −0.632(15) + 27.1 = 17.62
17.62% ≈ 18%
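The substitution can be sketched as a small function. The name `predicted_tip_percent` is hypothetical; the slope and intercept are the ones given in the card.

```python
def predicted_tip_percent(bill_dollars):
    """Tip percentage estimated by the best fit line y = -0.632x + 27.1."""
    return -0.632 * bill_dollars + 27.1

tip = predicted_tip_percent(15)
print(round(tip, 2))  # about 17.62
print(round(tip))     # 18, to the nearest whole percent
```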
A model predicts that a student who studies 4 hours will score 78. The residual is -8. What was the actual score?
70
A scatterplot of study hours and test scores has a correlation coefficient of 0.72. Explain what it suggests about the data.
There is a fairly strong positive linear relationship between hours studied and test scores: students who studied more hours tended to earn higher test scores. The best fit line represents the data reasonably well.
When the data in a scatter plot is loosely spread around the best fit line.
weak relationship
Is there an association between ice cream sales and weather conditions?
Sunny day Snowy day Total
<50 cones 8 7 15
>50 cones 22 4 26
total 30 11 41
From the column relative frequencies, most of the sunny days resulted in sales of more than 50 cones (22 of 30, or 73%), while most of the snowy days resulted in fewer than 50 cones sold (7 of 11, or 64%). Because these percentages are quite different, this suggests there is an association between the weather condition and the number of cones sold.
A class tracks how many hours per week students spend practicing piano and their corresponding test scores on a music theory quiz. The line of best fit for the data is:
Score = 4.5h + 62
What does the slope of the line (4.5) represent in this context?
The model predicts that a student's score will increase by 4.5 points for each additional hour they practice.
A linear model predicts the number of points a basketball player will score using the equation
Points = 2.3h + 10
If a player practices for 5 hours, the model predicts 21.5 points. The residual for 5 hours of practice is 7.5. Explain what it tells you about the model’s prediction.
The model underestimated the player's score. Its prediction of 21.5 points for 5 hours of practice was 7.5 points below the 29 points the player actually scored.
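Since a residual is the actual value minus the predicted value, the actual score can be recovered by adding the residual to the prediction. The sketch below assumes that definition; the function name `predicted_points` is illustrative.

```python
def predicted_points(hours):
    """Points predicted by the linear model Points = 2.3h + 10."""
    return 2.3 * hours + 10

prediction = predicted_points(5)   # about 21.5
residual_value = 7.5               # given in the card

# residual = actual - predicted, so actual = predicted + residual.
actual = prediction + residual_value
print(round(prediction, 1), round(actual, 1))  # about 21.5 and 29.0

# A positive residual means the prediction was too low.
print(residual_value > 0)  # True
```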
A researcher finds a correlation coefficient of –0.12 between the number of hours students sleep and their math test scores. Explain what this correlation says about the relationship and why it may not be meaningful.
It shows a very weak negative correlation, meaning there is almost no linear relationship between sleep hours and math scores, so the correlation is likely not meaningful or useful for prediction.