What's the difference between the population mean and the sample mean?
The population mean is the mean of the entire group; the sample mean is the mean of a sample drawn from that group
If I have a bell curve, what rule can I assume applies? (it starts with an e)
Empirical rule
These are the two components of a distribution
What are relative frequencies and possible values?
Suppose I record the price of every pizza in Iowa City and fit a model that shows Y = 15 + 3*X, where Y is price and X is the number of toppings. If I add a topping to a pizza, will the price go up $3?
No: correlation does not imply causation
What are the possible values of a relative frequency or proportion?
Between 0 and 1
What are the possible values of the correlation r? When does it apply?
Between -1 and 1; it applies when you have bivariate data and both variables are quantitative
I have a normal distribution with mean = 30 and standard dev = 5. What is AUC(x <= 30)?
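The area under a normal curve to the left of a point can be checked numerically. Below is a minimal sketch using `statistics.NormalDist` from Python's standard library; the variable names are illustrative only. Because 30 is the mean and the normal curve is symmetric, half of the area lies to its left.

```python
from statistics import NormalDist

# Normal distribution with mean 30 and standard deviation 5, as in the question.
dist = NormalDist(mu=30, sigma=5)

# AUC(x <= 30) is the cumulative distribution function evaluated at 30.
auc_left_of_mean = dist.cdf(30)
print(auc_left_of_mean)
```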
The possible values when rolling two dice and counting the total number of dots facing up
What are 2, 3, ..., 12?
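As a rough check, the possible values (and their relative frequencies, the other component of a distribution) can be enumerated directly. This sketch assumes two fair six-sided dice; all names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every ordered outcome of two fair six-sided dice (36 in total).
rolls = list(product(range(1, 7), repeat=2))

# Count how often each total appears.
counts = Counter(a + b for a, b in rolls)

# The two components of the distribution: possible values and relative frequencies.
possible_values = sorted(counts)
relative_freq = {total: Fraction(n, len(rolls)) for total, n in counts.items()}

for total in possible_values:
    print(total, relative_freq[total])
```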
Suppose your TA fits a linear model of Price and Toilet Paper Holders and gets an r-squared value of 0.62. Interpret the r-squared.
62% of the variation in price is explained, for whatever reason, by the variation in toilet paper holders
Qualitative ordinal
Your TA is 61 inches tall. Suppose she wants to calculate her Z score. What information does she need for that formula?
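For reference, a Z score is the observation minus the mean, divided by the standard deviation. The sketch below uses a hypothetical mean of 64 inches and a hypothetical standard deviation of 2 inches; neither number comes from the course material.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical values, for illustration only.
assumed_mean = 64
assumed_sd = 2

ta_z = z_score(61, assumed_mean, assumed_sd)
print(ta_z)
```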
I have that AUC (x <= 5.4) = 0.6. What is the 60th percentile?
5.4
The type of distribution where the mode is greater than the median, which is greater than the mean
What is a left skewed distribution?
Suppose I have an r-squared of 1 and I know 2 of my data points are as follows:
(30, 35)
(31, 34)
What is the correlation?
-1
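An r-squared of 1 means r is either 1 or -1, and the sign matches the direction of the line. Since Y falls from 35 to 34 as X rises from 30 to 31, the sign is negative. A minimal sketch that checks the sign from the two known points (the helper `pearson_r` is illustrative, not a course function):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# The two known data points: (30, 35) and (31, 34).
r = pearson_r([30, 31], [35, 34])
print(r)
```

Any two distinct points give r of exactly 1 or -1, so this only confirms the direction; the r-squared of 1 is what guarantees the rest of the data lies on the same line.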
Suppose your TA takes a sample of animals in her neighborhood. Out of 10 animals, 3 are tan dogs. Given that an animal is tan, the proportion that are dogs is 3/7. What proportion of animals are tan?
HINT: You can use that P(A | B) = P (A and B) / P (B)
7/10 animals are tan
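The hint can be rearranged to P(B) = P(A and B) / P(A | B), with A = dog and B = tan. A short sketch of that arithmetic using exact fractions (variable names are illustrative):

```python
from fractions import Fraction

# From the question: 3 of 10 animals are tan dogs, and P(dog | tan) = 3/7.
p_dog_and_tan = Fraction(3, 10)
p_dog_given_tan = Fraction(3, 7)

# Rearranging P(A | B) = P(A and B) / P(B) to solve for P(B).
p_tan = p_dog_and_tan / p_dog_given_tan
print(p_tan)  # 7/10
```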
Let's say your TA does a study of Starbucks orders. She checks if they ordered a specialty drink (yes/no) and if they bought a food item (yes/no). What is the formula for LIFT for specialty drink? Suppose my LIFT is 3.2. Interpret that.
Prop ( food = yes | specialty drink = yes) / prop (food = yes)
Among orders that include a specialty drink, the proportion that include food is 3.2 times the proportion of all orders that include food.
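Lift divides a conditional proportion by the corresponding overall proportion. The sketch below works through the formula with made-up counts; the counts are hypothetical and were not chosen to reproduce the 3.2 in the question.

```python
from fractions import Fraction

# Hypothetical counts, for illustration only.
total_orders = 100
specialty_orders = 20            # orders with specialty drink = yes
specialty_and_food_orders = 8    # orders with specialty drink = yes and food = yes
food_orders = 25                 # orders with food = yes

# prop(food = yes | specialty drink = yes)
p_food_given_specialty = Fraction(specialty_and_food_orders, specialty_orders)

# prop(food = yes)
p_food = Fraction(food_orders, total_orders)

lift = p_food_given_specialty / p_food
print(lift)  # 8/5
```

With these counts the lift is 8/5 = 1.6: specialty-drink orders include food at 1.6 times the overall rate.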
Suppose I know 80% of the AUC is between 3 and 5. What percentile is 3?
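10th percentile, assuming the distribution is symmetric so that the remaining 20% splits evenly between the two tails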
Income has a mean of 30,000 and a standard deviation of 40,000. What number summary should we use?
5 number summary
Suppose I have the equation for the LSE line Y = b0 + b*X. I also have the data point (a, c). What would I do to calculate the residual?
Plug a into the LSE line and subtract that from c
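In symbols, the residual is c - (b0 + b*a): observed minus predicted. A minimal sketch, borrowing the pizza line Y = 15 + 3*X from earlier and a hypothetical data point (2, 20) that is not from the course material:

```python
def residual(b0, b, a, c):
    """Residual for the data point (a, c) under the line Y = b0 + b*X."""
    predicted = b0 + b * a   # plug a into the LSE line
    return c - predicted     # observed minus predicted

# Hypothetical example: a 2-topping pizza that costs $20.
example_residual = residual(b0=15, b=3, a=2, c=20)
print(example_residual)  # -1
```

A negative residual means the point sits below the line, so this pizza is $1 cheaper than the line predicts.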
My histogram bin is [30,40). I have the following data:
30, 30, 35, 36, 37, 40, 40, 32, 40, 46
What is my density?
0.06
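The density of a bin is its relative frequency divided by its width. A short sketch of that calculation for the data above (variable names are illustrative); note that 40 is excluded because the bin [30,40) is open on the right.

```python
from fractions import Fraction

data = [30, 30, 35, 36, 37, 40, 40, 32, 40, 46]
lower, upper = 30, 40  # the bin [30, 40): includes 30, excludes 40

# Count observations that fall in the bin.
count_in_bin = sum(1 for x in data if lower <= x < upper)

# Relative frequency = count / n; density = relative frequency / bin width.
relative_frequency = Fraction(count_in_bin, len(data))
density = relative_frequency / (upper - lower)

print(count_in_bin, relative_frequency, density)
```

Six of the ten values land in the bin, so the relative frequency is 6/10 and the density is (6/10) / 10 = 3/50 = 0.06.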
Milhouse scored 0.75 standard deviations below the mean on B1. The correlation between B1 and B2 is 0.6. How many standard deviations above or below the mean should Milhouse expect to score on B2?
0.6 * 0.75 = 0.45 standard deviations below the mean
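This is regression to the mean: the predicted Z score on the second exam is the correlation times the Z score on the first. A minimal sketch of that arithmetic (variable names are illustrative):

```python
r = 0.6        # correlation between B1 and B2
z_b1 = -0.75   # Milhouse is 0.75 standard deviations below the mean on B1

# Predicted Z score on B2 is r times the Z score on B1.
z_b2_predicted = r * z_b1
print(round(z_b2_predicted, 2))
```

The prediction stays below the mean but moves closer to it, because the correlation is smaller than 1 in absolute value.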
I can model the price of throwing a party with b + cX + (a+d)Y. Suppose b = 4, c = 1, a = 2, d = 3, x bar = 6, y bar = 1. What is my mean?
HINT: use the fact that mean(a + bX + cY) = a + b* x bar + c * y bar
HINT: be very careful about what you choose for a, b, and c!
a = 4, b = 1, c = 5
4 + 6 + 5 = 15
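The relabeling is the tricky part: the question's b is the intercept, its c multiplies X, and its (a + d) multiplies Y. A short sketch of the same calculation, with a helper name that is illustrative only:

```python
def mean_of_linear_combination(intercept, x_coef, y_coef, x_bar, y_bar):
    """mean(intercept + x_coef*X + y_coef*Y) = intercept + x_coef*x_bar + y_coef*y_bar."""
    return intercept + x_coef * x_bar + y_coef * y_bar

# Values from the question: b = 4, c = 1, a = 2, d = 3, x bar = 6, y bar = 1.
party_mean = mean_of_linear_combination(
    intercept=4,     # the question's b
    x_coef=1,        # the question's c
    y_coef=2 + 3,    # the question's a + d
    x_bar=6,
    y_bar=1,
)
print(party_mean)  # 15
```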
Let's say your TA does a study of Starbucks orders. She checks if they ordered a specialty drink (yes/no) and if they bought a food item (yes/no). What are the possible values of the conditional distribution of Specialty Drink | food item?
Specialty drink yes/no
Let's say I need to calculate a slope and I know r, sx, and sy. What slide do I need to go to for the formula?
L4.1 slide 16
This is the time your TA is in bed and you shouldn't count on an email response
What is 10pm?