What do you need to list when describing a scatterplot?
Direction
Form
Strength
Unusual Features
True or False
Random scatter in the residuals indicates a model with high predictive power.
False. Random scatter in the residuals indicates that a linear model is appropriate. It does not tell us how strong the relationship is.
What do you call a data point that does not fit with the rest of your data points?
Outlier
A scatterplot of log(Y) vs. log(X) reveals a linear pattern with very little scatter. It is probably true that ... (choose one)
A) the correlation between X and Y is near +1.
B) the correlation between X and Y is near 0.
C) the scatterplot of Y vs X shows a linear association.
D) the residuals plot for regression of Y on X shows a curved pattern.
E) the calculator’s LnReg function will model the association between X and Y.
D
If you have a z-score of -1.6, that means you are 1.6 (blank) below the mean.
Standard Deviations
True or False (Explain Why)
If there is a correlation of 0 between two quantitative variables, that means that there is no association between the two variables.
False. Correlation measures only linear relationships. If the relationship is nonlinear, the correlation could equal 0 even though the association is strong.
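A minimal sketch of this point, using made-up data in which y is completely determined by x but the pattern is a parabola. The helper `pearson_r` and the data values are illustrative assumptions, not part of the original question.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect curved relationship, y = x^2, symmetric about 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)  # r is 0 even though y depends entirely on x
```

The positive and negative halves of the parabola cancel in the numerator, so r comes out as 0 despite the strong association.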
Over the past decade a farmer has been able to increase his wheat production by about the same number of bushels each year. This is a (blank) relationship. Another farmer has increased his wheat production by about the same percentage each year. This is a (blank) relationship.
Linear
Exponential
What do you call a variable that affects both the x and the y variable?
Lurking variable
Predict y when x is 4
log(y) = 5 - 0.23 log(x)
72,698.63
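A short sketch of the back-transformation, assuming both logarithms are base 10, which is the reading that reproduces the stated answer of about 72,698.63. The function name `predict_y` is a hypothetical helper.

```python
import math

def predict_y(x):
    """Back-transform log10(y) = 5 - 0.23 * log10(x) to get y (base 10 assumed)."""
    log_y = 5 - 0.23 * math.log10(x)
    return 10 ** log_y  # undo the base-10 log

y_hat = predict_y(4)
print(round(y_hat, 2))
```

The key step is undoing the log with the matching base: 10 raised to the power for log, e raised to the power for ln.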
In school 1, 57 kindergarteners were tested and their mean score was 82. In school 2, 23 kindergartners were tested and their mean score was 63. What was the mean of all the kindergartners together?
76.5
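The combined mean is a weighted average, with each school's mean weighted by its number of students. A small sketch of that arithmetic (the function name `combined_mean` is a hypothetical helper):

```python
def combined_mean(groups):
    """groups: list of (count, mean) pairs. Returns the mean of all observations pooled."""
    total = sum(n * m for n, m in groups)   # sum of all scores
    count = sum(n for n, _ in groups)       # number of students
    return total / count

overall = combined_mean([(57, 82), (23, 63)])
print(overall)  # 76.5375
```

Simply averaging 82 and 63 would give 72.5, which is wrong because school 1 has more than twice as many students.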
What are 2 words used for the x variable and 2 words used for the y variable?
X
Explanatory
Predictor
Y
Response
Dependent
A cooking competition rated each participant's dish on both appearance and taste according to a 10 point scale. Your dish scored a 2 out of 10 on appearance, so it was predicted that it would only get a 4 out of 10 on taste. You actually scored an 8 on taste. What is the residual?
4
What is it called when you try to predict a data point not in the range of your scatterplot?
Extrapolation
How do you reverse the following...
1. Log (x) = 2
2. ln (x) = 2
3. 1/x = 2
4. x^2 = 2
5. square root of x = 2
1. 10^2
2. e^2
3. 1/2
4. square root of 2
5. 4
In a Normal model, what percent of data are between -1 and 1 standard deviations of the mean?
In a Normal model, what percent of data are between -2 and 2 standard deviations of the mean?
In a Normal model, what percent of data are between -3 and 3 standard deviations of the mean?
68%
95%
99.7%
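These percentages are the rounded values of the 68-95-99.7 rule, which assumes a Normal model. A quick sketch that checks them with the error function from Python's standard library (the helper name `normal_within` is an assumption):

```python
import math

def normal_within(k):
    """Probability that a Normal variable falls within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * normal_within(k), 1))
```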
Four of these statements have a mistake. State which one is correct and explain why the other four are wrong.
1. The correlation between height and weight is 0.568 inches per pound
2. The correlation between weight and length of foot is 0.488
3. The correlation between the breed of a dog and its weight is 0.435
4. The correlation between a person's age and vision (20/20?) is r=-1.04
5. If the correlation between blood alcohol level and reaction time is 0.73, then the correlation between reaction time and alcohol level is -0.73
1. Correlation does not have units
2. CORRECT
3. Breed of dog is categorical
4. Correlation has to be between 1 and -1
5. Correlation between x and y is the same as correlation between y and x
What is a negative residual? What is a positive residual? Is one of them better than the other? Why?
A negative residual is when the actual value is lower than predicted. A positive residual is when the actual value is higher than predicted. Neither is better than the other. It depends on context.
What do you call a data point that has a very different x value than the rest of your data points?
Leverage point
Predict the intensity when distance is 12.
1/(square root of intensity) = 0.00006+0.022(Distance)
About 14.3
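A sketch of the inversion, using the coefficients as printed in the question. Carrying full precision gives about 14.34; rounding the right-hand side to 0.26 before inverting gives about 14.8, so the result is sensitive to early rounding. The function name `predict_intensity` is a hypothetical helper.

```python
def predict_intensity(distance):
    """Invert 1/sqrt(intensity) = 0.00006 + 0.022 * distance to get intensity."""
    reciprocal_root = 0.00006 + 0.022 * distance  # this is 1/sqrt(intensity)
    root = 1 / reciprocal_root                    # undo the reciprocal
    return root ** 2                              # undo the square root

intensity_hat = predict_intensity(12)
print(round(intensity_hat, 2))
```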
Quiz Scores (Out of 15)
Mean = 10.95 points
s = 2.481 points
min = 4
Q1 = 9.5
med = 12
Q3 = 12
max = 15
A teacher multiplies each score by 6 and adds 10. Find the new median, IQR, mean, and standard deviation.
Median = 82
IQR = 15
Mean = 75.7
Standard Deviation = 14.886
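The rule behind these answers: multiplying and adding both change measures of center (mean, median), while only the multiplication changes measures of spread (IQR, standard deviation). A small sketch of that rule applied to the quiz summary; the function name `rescale` and the dictionary layout are assumptions for illustration.

```python
def rescale(summary, multiply_by, add):
    """Apply new = multiply_by * old + add to a dict of summary statistics."""
    iqr = summary["q3"] - summary["q1"]
    return {
        # Measures of center are scaled and shifted.
        "median": multiply_by * summary["median"] + add,
        "mean": multiply_by * summary["mean"] + add,
        # Measures of spread are scaled only; adding a constant does not change them.
        "iqr": abs(multiply_by) * iqr,
        "sd": abs(multiply_by) * summary["sd"],
    }

quiz = {"mean": 10.95, "sd": 2.481, "q1": 9.5, "median": 12, "q3": 12}
new = rescale(quiz, multiply_by=6, add=10)
for name, value in new.items():
    print(name, round(value, 3))
```

For the standard deviation this is 2.481 × 6 = 14.886, with no 10 added.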
1. A couple of years ago, a local newspaper published research results claiming a positive association between the number of years high school children had taken instrumental music lessons and their performances in school. What does "positive association" mean in this context?
2. A group of parents then went to the School Board demanding more funding for music programs as a way to improve student chances for academic success in high school. Do you agree or disagree with their reasoning? Why?
1. Students who have taken more years of instrumental music lessons tend to perform better in school.
2. Disagree. Correlation does not imply causation. A relationship between two variables does not mean that one of them is causing the other.
Data collected from internet ads for 2010 Kia were used to create a model to estimate the asking price of a car based on the number of miles it had been driven.
R-Squared = 0.46
Predicted Price = 15327 - 0.11(Miles)
Answer the following questions
1. What is the slope in context?
2. What is the intercept in context?
3. What is R-Squared in context?
4. What is the correlation coefficient?
1. For each additional mile driven, the predicted asking price decreases by about 11 cents.
2. A car with 0 miles has a predicted asking price of $15,327.
3. 46% of the variation in price is explained by the variation in mileage.
4. -0.678 (the square root of 0.46, with a negative sign because the slope is negative)
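A brief sketch of how the model and R-Squared are used here. The correlation coefficient is the square root of R-Squared with the sign of the slope, so with a slope of -0.11 it is negative. The mileage of 50,000 is a hypothetical input chosen only to show a prediction.

```python
import math

INTERCEPT = 15327   # predicted price in dollars at 0 miles
SLOPE = -0.11       # change in predicted price per additional mile
R_SQUARED = 0.46

def predicted_price(miles):
    """Predicted asking price in dollars from the fitted line."""
    return INTERCEPT + SLOPE * miles

# r has the same sign as the slope of the regression line.
r = math.copysign(math.sqrt(R_SQUARED), SLOPE)

print(round(r, 3))
print(round(predicted_price(50_000), 2))  # hypothetical mileage
```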
State whether the following statements about influential points are true or false:
1. Influential points have small residuals.
2. Outliers in the vertical direction are more likely to be influential points than outliers in the horizontal direction.
3. Influential points change the regression line.
1. True
2. False
3. True
Using the following information, predict the student population in 2014.
Explanatory Variable: Years since 2000
Dependent Variable: log(students)
Sample size: 6
R-sq=0.994
Parameter | Estimate | Std. Err.
Constant | 2.871 | 0.0162
Year | 0.0389 | 0.00152
2604 students
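A sketch of the calculation behind this answer, assuming log means base 10 and that the fitted model is log(students) = 2.871 + 0.0389 × (years since 2000), as the regression output suggests. The function name `predict_students` is a hypothetical helper.

```python
def predict_students(year):
    """Predict enrollment from log10(students) = 2.871 + 0.0389 * (years since 2000)."""
    years_since_2000 = year - 2000
    log_students = 2.871 + 0.0389 * years_since_2000
    return 10 ** log_students  # undo the base-10 log

print(round(predict_students(2014)))  # 2604
```

Note that 2014 must first be converted to 14 years since 2000 before it goes into the equation.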
Given the table on the board
1) What is the marginal distribution (in %) of college major
2) What is the conditional distribution (in %) of college major for men?
3) What is the conditional distribution (in %) of college major for women?
1
Biology = 37%
Physics = 27%
Chemistry = 37%
2
Biology = 44%
Physics = 18%
Chemistry = 38%
3
Biology = 25%
Physics = 41%
Chemistry = 34%