Back to Basics
Probability is the law
You're probably sampling
How confident are you?
Infer me this
100
The study of collecting, organizing, analyzing, and interpreting data, often in large quantities.
What is Statistics?
100
When 25% of the marbles in a bag are damaged, 30% are heavy, and 40% are either damaged or heavy, this is the probability that a marble drawn at random is both heavy and damaged.
What is 0.15? Recall P(H or D) = P(H) + P(D) - P(H and D), so 0.4 = 0.3 + 0.25 - P(H and D), so P(H and D) = 0.15.
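The arithmetic in this answer can be checked in a few lines. This is only a sketch; the variable names are illustrative and the numbers are the ones given in the clue.

```python
# Marble probabilities taken from the clue.
p_damaged = 0.25            # P(D)
p_heavy = 0.30              # P(H)
p_damaged_or_heavy = 0.40   # P(H or D)

# Addition rule, rearranged: P(H and D) = P(H) + P(D) - P(H or D).
p_heavy_and_damaged = p_heavy + p_damaged - p_damaged_or_heavy
print(round(p_heavy_and_damaged, 2))  # 0.15
```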
100
This is the mean and standard deviation of the sample average xbar from a sample of n observations from a population with mean mu and standard deviation sigma.
What are mu (for the mean) and sigma/sqrt(n) (for the standard deviation)?
100
This is when we use a t-distribution instead of a standard normal random variable in constructing a confidence interval.
What is when the true population standard deviation is unknown? Note: We need large sample sizes or a normally distributed population.
100
This is a p-value.
What is the probability, assuming the null hypothesis is true, of observing a sample average at least as extreme as the one we actually observed in our sample?
200
DOUBLE JEOPARDY!! You may wager anywhere from $0 to $1000, in increments of $200.
It does NOT imply causation.
What is correlation? Remember, two variables can be highly correlated but this does not imply that a change in one CAUSES a change in the other.
200
These are the two definitions of independence of two events A and B. (One mathematical and one in English)
What is P(A and B) = P(A)P(B), as well as the notion that knowing whether A occurred does not change the probability of B, and vice versa?
200
This is the Central Limit Theorem.
What is a statement which says that, for large sample sizes n, the sample average ("x bar") is approximately normally distributed with mean mu and standard deviation sigma/sqrt(n)? Here, mu and sigma are the mean and standard deviation of the entire population.
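A small simulation can make the Central Limit Theorem concrete. The sketch below assumes a Uniform(0, 1) population, a sample size of 30, and 5000 repetitions; all three are hypothetical choices made for illustration. It compares the simulated mean and standard deviation of x bar with mu and sigma/sqrt(n).

```python
import math
import random
import statistics

random.seed(0)   # fixed seed so the run is repeatable

n = 30           # sample size (hypothetical choice)
reps = 5000      # number of simulated samples (hypothetical choice)

# Assumed population: Uniform(0, 1), with mu = 0.5 and sigma = sqrt(1/12).
mu = 0.5
sigma = math.sqrt(1 / 12)

# Draw many samples and record the sample average of each one.
xbars = [statistics.fmean([random.random() for _ in range(n)])
         for _ in range(reps)]

mean_of_xbars = statistics.fmean(xbars)
sd_of_xbars = statistics.stdev(xbars)

print("mean of x bar:", round(mean_of_xbars, 3), "vs mu =", mu)
print("sd of x bar:  ", round(sd_of_xbars, 4),
      "vs sigma/sqrt(n) =", round(sigma / math.sqrt(n), 4))
```

A histogram of `xbars` should look roughly bell-shaped even though the population itself is flat.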
200
DOUBLE JEOPARDY!! You may wager anywhere from $0 to $1000, in increments of $200.
This is the margin of error for the confidence interval (8.4, 9.9).
What is 0.75? [Half the width of the interval: (9.9 - 8.4)/2]
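As a quick check, the margin of error and the point estimate can both be recovered from the endpoints. This sketch assumes the interval is symmetric about its point estimate, which holds for the z and t intervals in this review.

```python
lower, upper = 8.4, 9.9   # interval from the clue

# For a symmetric interval, estimate +/- margin gives the two endpoints.
margin_of_error = (upper - lower) / 2
point_estimate = (upper + lower) / 2

print(round(margin_of_error, 2))  # 0.75
print(round(point_estimate, 2))   # 9.15
```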
200
This is the probability of making a Type I error in a significance test.
What is the significance level?
300
It fits quantitative data to a line in a way that minimizes the sum of the squared vertical distances between the data points and the line, and it can be used to make predictions.
What is linear (least-squares) regression? Follow up: residuals, prediction, extrapolation, scatter plots, correlation and the slope of a regression line
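Here is a minimal sketch of a least-squares fit using the textbook formulas slope = Sxy / Sxx and intercept = ybar - slope * xbar. The data points are made up for illustration and do not come from the course.

```python
# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# Sxy and Sxx are sums of cross-deviations and squared x-deviations.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Residual = observed y minus the y predicted by the line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Prediction at x = 6. This is extrapolation, since 6 is outside the data.
prediction = intercept + slope * 6
print(round(slope, 2), round(intercept, 2), round(prediction, 2))
```

With an intercept in the model, the residuals of a least-squares line always sum to zero (up to rounding), which is a handy sanity check.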
300
This is the number of ways to select k objects from n objects when the order of selection does not matter.
What is "n choose k"? This is equal to n! / (k!(n-k)!). [See Midterm 2, Midterm 2 practice problems, and homework for examples where we use these values]
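Python's standard library computes this directly with `math.comb`. The sketch below also checks it against the factorial formula; the values n = 5 and k = 2 are an arbitrary example.

```python
from math import comb, factorial

n, k = 5, 2   # arbitrary example values

# Built-in binomial coefficient.
n_choose_k = comb(n, k)

# Same value from the formula n! / (k! (n-k)!).
by_formula = factorial(n) // (factorial(k) * factorial(n - k))

print(n_choose_k, by_formula)  # 10 10
```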
300
This is the definition of a binomial random variable Bin(n,p).
What is the number of successes out of n independent and identical trials where the probability of success in each trial is p. Equivalently, it is the sum of n independent random variables, each equal to 1 with probability p. [What are some examples?]
300
This is the probability that a random variable T with the t-distribution with 10 degrees of freedom is at most -1.372.
What is 0.10? Use the symmetry of the t-density (draw a picture): P(T < -1.372) = P(T > 1.372) = 0.10 from the table. Follow up: What is P(T < 2.359)? [0.98]
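Without a table, the same probabilities can be approximated by integrating the t-density numerically. The sketch below uses only the standard library. The helper names `t_pdf` and `t_cdf` are hypothetical, and Simpson's rule is simply one convenient way to do the integral.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) using symmetry and Simpson's rule on [0, |x|].

    steps must be even for Simpson's rule.
    """
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3   # approximates P(0 <= T <= |x|)
    # By symmetry, P(T <= 0) = 0.5.
    return 0.5 + area if x >= 0 else 0.5 - area

p_left_tail = t_cdf(-1.372, 10)
p_follow_up = t_cdf(2.359, 10)
print(round(p_left_tail, 2), round(p_follow_up, 2))
```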
300
These are the Type I and Type II errors.
What is (type I error) the wrong decision that is made when a test rejects a true null hypothesis (H0), and (type II error) the wrong decision that is made when a test fails to reject a false null hypothesis.
400
Six of the plots that we have learned to display data.
What are stemplots, scatter plots, boxplots, pie charts, histograms, bar graphs? [Practice creating and interpreting all these plots]
400
It's what you might use if you knew P(A|B) but were trying to find P(B|A).
What is Bayes' rule? [See Section 4.5 for details]
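As a worked example, Bayes' rule can be applied to the marble clue from earlier in the game, where P(H) = 0.30, P(D) = 0.25, and P(H and D) = 0.15, so P(D|H) = 0.15 / 0.30 = 0.5. The sketch below reverses the conditioning to get P(H|D); the variable names are illustrative.

```python
# Values from the marble clue earlier in the game.
p_heavy = 0.30                 # P(H)
p_damaged = 0.25               # P(D)
p_damaged_given_heavy = 0.50   # P(D|H) = P(H and D) / P(H) = 0.15 / 0.30

# Bayes' rule: P(H|D) = P(D|H) * P(H) / P(D).
p_heavy_given_damaged = p_damaged_given_heavy * p_heavy / p_damaged
print(round(p_heavy_given_damaged, 2))  # 0.6
```

The result agrees with computing P(H and D) / P(D) = 0.15 / 0.25 directly.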
400
If 13% of swans are black, this is the probability that exactly 10 are black when you select 100 at random.
What is 0.086? Observe that the number of black swans is Bin(100, 0.13). Call this number X. Then P(X=10) = (100 choose 10) * (0.13)^10 * (1-0.13)^90 = 0.086. Follow up: What's the probability that at most 3 are black? [P(X<=3) = P(X=0) + P(X=1) + P(X=2) + P(X=3) = 0.0006]
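Both numbers in this answer can be reproduced with the binomial formula. This is a minimal sketch using the standard library; `binom_pmf` is a hypothetical helper name.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 100, 0.13   # values from the clue

p_exactly_10 = binom_pmf(10, n, p)

# "At most 3" means X = 0, 1, 2, or 3.
p_at_most_3 = sum(binom_pmf(k, n, p) for k in range(4))

print(round(p_exactly_10, 3))  # 0.086
print(round(p_at_most_3, 4))   # 0.0006
```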
400
This is the interpretation of a 96% confidence interval for mu of (8.4, 9.9).
What is "We are 96% confident that the true value of mu lies between 8.4 and 9.9. Meaning, if we were to draw samples repeatedly and calculate confidence intervals for each sample, we expect 96% of them to contain the true value for mu"
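The "repeated samples" reading can be illustrated by simulation. The sketch below assumes a normal population with a known standard deviation, so it uses a z-interval. The values of mu, sigma, n, and the number of repetitions are hypothetical. It counts how often the 96% interval contains the true mu.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(1)   # fixed seed so the run is repeatable

# Hypothetical population and sampling settings.
mu, sigma, n, reps = 9.0, 2.0, 25, 2000

# A 96% interval leaves 2% in each tail, so z* is the 0.98 quantile.
z_star = NormalDist().inv_cdf(0.98)
margin = z_star * sigma / math.sqrt(n)

covered = 0
for _ in range(reps):
    xbar = fmean([random.gauss(mu, sigma) for _ in range(n)])
    if xbar - margin <= mu <= xbar + margin:
        covered += 1

coverage = covered / reps
print("fraction of intervals containing mu:", coverage)
```

The printed fraction should land close to 0.96, though it varies a little from seed to seed.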
400
This is the power of a significance test.
What is the probability that the test rejects the null hypothesis when a specific alternative is true?
500
In the two-way table

Smoker?      Men     Women
Yes          1630    1684
No           5550    8232

this distribution is:

             Men     Women
Proportion   0.42    0.58
What is the marginal distribution of gender? [Practice also: joint distributions, conditional distributions, Simpson's paradox]
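The proportions in the clue come from the column totals of the table. The sketch below recomputes them from the counts given; the dictionary layout is just one convenient representation.

```python
# Counts from the two-way table in the clue.
table = {
    "Yes": {"Men": 1630, "Women": 1684},
    "No":  {"Men": 5550, "Women": 8232},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of gender: column total divided by grand total.
marginal_gender = {
    gender: sum(row[gender] for row in table.values()) / grand_total
    for gender in ("Men", "Women")
}

for gender, proportion in marginal_gender.items():
    print(gender, round(proportion, 2))
```

Dividing each cell by its column total instead would give the conditional distribution of smoking status for each gender.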
500
These are the mathematical and intuitive definitions of conditional probability, for example P(A|B).
What are P(A|B) = P(A and B) / P(B), and intuitively, P(A|B) is the probability that A occurs given that B has occurred.
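Applied to the marble clue from earlier in the game, the definition gives P(D|H) directly. The sketch below also compares P(H and D) with P(H)P(D) to check independence; the variable names are illustrative.

```python
# Values from the marble clue earlier in the game.
p_heavy = 0.30              # P(H)
p_damaged = 0.25            # P(D)
p_heavy_and_damaged = 0.15  # P(H and D)

# Definition: P(D|H) = P(H and D) / P(H).
p_damaged_given_heavy = p_heavy_and_damaged / p_heavy
print(round(p_damaged_given_heavy, 2))  # 0.5

# H and D are independent only if P(H and D) equals P(H) * P(D).
is_independent = abs(p_heavy_and_damaged - p_heavy * p_damaged) < 1e-9
print(is_independent)  # False
```

Knowing a marble is heavy raises the chance that it is damaged from 0.25 to 0.5, which is another way to see that the two events are not independent.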
500
This is an example of a random variable X for which P(X=a) is zero for all numbers a.
What is any continuous random variable? For example, uniform random variable or normal random variable [anything with a density!]
500
Let mu1 be the true population mean weight of Farmer John's pigs and mu2 be the true population mean weight of Farmer Henry's pigs. This is the interpretation of a confidence interval for mu1 - mu2 of (12.1, 36.5) lbs.
What is: on average, Farmer John's pigs weigh more? Because the whole interval lies above 0, we can be confident that on average they weigh at least 12.1 lbs more!
500
A two-sided significance test with significance level 0.02 will reject the null hypothesis that mu=7 only when this is true about a confidence interval.
What is when the 98% confidence interval does not contain 7.
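The link between the test and the interval can be written as a one-line check. In the sketch below, `rejects_null` is a hypothetical helper, and the interval endpoints are made-up values standing in for a 98% confidence interval.

```python
def rejects_null(ci_lower, ci_upper, mu0):
    """Two-sided test at level alpha rejects H0: mu = mu0 exactly when
    mu0 falls outside the matching (1 - alpha) confidence interval."""
    return not (ci_lower <= mu0 <= ci_upper)

# Hypothetical 98% confidence interval, for illustration only.
ci = (8.4, 9.9)

print(rejects_null(*ci, mu0=7))  # True: 7 is outside the interval
print(rejects_null(*ci, mu0=9))  # False: 9 is inside the interval
```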