The chance of a single event occurring, expressed as a number between 0 and 1.
Probability
A statistic used to estimate a parameter
Point Estimator
A part of the population that is actually observed.
A sample.
A measure of central tendency found by summing all values and dividing by the number of values.
The mean.
A visual representation of the relationship between two quantitative variables.
A scatterplot
What rule would you use to calculate the probability of either A or B happening when A and B are not mutually exclusive?
The general addition rule
What is a large enough sample size (n is greater than or equal to 30) or population distribution is normal?
What is the key difference between an experiment and an observational study?
An experiment imposes treatments, observational studies do not.
A boxplot or a histogram.
A straight line that minimizes the sum of the squared vertical distances from the data points to the line.
least-squares regression line
Two events are independent. P (A) = 0.4 and P (B) = 0.6, what is P (A and B)?
0.24
You take many random samples of size 50 from a population. What will the standard deviation of the sample means be compared to the population standard deviation?
What is smaller by a factor of the square root of n?
Name one method that reduces bias when sampling a population.
Random sampling.
In a skewed distribution, which measure of center is more appropriate and why?
The median, because it is resistant to outliers.
If the correlation between two variables is 0.85, describe the direction and strength.
Strong and positive.
If the probability of rain is 0.7 and the probability of being late given that it rains is 0.8, what is the joint probability of both?
0.56
Explain the difference between standard deviation and standard error
Standard deviation measures variability in a population, while standard error measures variability of a statistic from sample to sample.
What design ensures each treatment appears in each block exactly once?
A randomized block design.
What does a z-score represent?
The number of standard deviations a value is from the mean.
The slope of the least-squares regression line is 2.3. What does this mean in context?
For every unit increase in the explanatory variable, the response variable increases by 2.3 units.
A company has two machines. Machine A produces 70% of parts and 2% are defective; Machine B produces 30% and 5% are defective. A part is selected at random and found to be defective. What is the probability it came from Machine B?
0.39 (Use Bayes' Theorem)
For a proportion, what is the formula for the standard error and when is it valid?
What is sqrt[(p(1-p))/n], valid when np is greater than or equal to 10 and n(1-p) is greater than or equal to 10?
What is the purpose of a double-blind experiment, and what kinds of bias does it reduce.
What is to prevent subject and experiment bias, such as the placebo effect and observer bias?
A data set has a mean of 60 and a standard deviation of 5. What is the 95% confidence interval for a single observation assuming normality?
(50,70) (using empirical rule)
You calculate r^2=0.64. What percent of the variation in the response variable is explained by the model.
64%