What are the three types of Chi-Square tests, and how do you distinguish between them based on the number of samples and variables?
Goodness of Fit, Homogeneity, & Independence/Association
What is the basic formula for finding the degrees of freedom df in a simple one-sample t-test?
n - 1
What does a correlation coefficient of r = 0 mean about the relationship between two variables?
There is no linear relationship between the variables.
If a scatterplot looks like a curved banana shape instead of a straight line, is a linear model appropriate?
No. Linear models should only be used if the original scatterplot looks straight.
What is the name of the fake, sugar-pill treatment given to a control group in an medical experiment?
A Placebo.
What is the formula for calculating the expected count for any cell in a two-way table during a Chi-Square test for Independence?
(row total x column total)/ table total
How do you determine the degrees of freedom for a standard One-Sample t-test for a mean, and what happens to the shape of the t-distribution as the degrees of freedom increase
df = n - 1. As degrees of freedom increase, the t-distribution becomes taller and narrower
If your actual data point is y = 10 and your regression line predicted y^ = 8, what is the value of your residual?
+2
residual = actual - predicted = 10 - 8 = 2
What math tool do statisticians most commonly use to turn a curved, non-linear dataset into a straight line?
Logarithms
What critical step must an experiment include to prove that a treatment actually caused a change?
Random Assignment of subjects to treatment groups.
State the specific sample size condition required to safely perform a Chi-Square test.
Large Counts Condition: All expected counts must be at least 5
If your sample size is 50, do you need to check a graph of the data for normality before running a t-test? Why or why not?
No. Because n is greater than or equal to 30, the Central Limit Theorem guarantees the sampling distribution is approximately normal.
What is the name of the line that minimizes the sum of the squared residuals in a scatterplot?
The Least-Squares Regression Line (or Line of Best Fit).
If you look at a residual plot and see a clear U-shaped pattern, what does that tell you about your regression line?
The linear model is not a good fit for the data.
If a researcher only hands out surveys to their friends because it is fast and easy, what type of sampling method are they using?
Convenience Sampling.
If a Chi-Square test for Independence is conducted on a contingency table with 4 rows and 3 columns, how many degrees of freedom does the test statistic have?
df = (rows - 1)(columns - 1) = (4 - 1)(3 - 1) = 3 x 2 = 6
If you calculate a 95% confidence interval for a mean to be (12, 18), does the value 10 fall inside or outside your plausible range?
Outside. (It is below the lower bound of 12).
If the slope of a regression line is negative, will the correlation coefficient (r) be positive or negative?
Negative. (The slope and the correlation coefficient always share the same sign)
If you look at a scatterplot of your original data and it forms a perfect, straight line, do you need to transform the data?
No. Transformations are only used if the original data is curved.
Why is a completely voluntary internet poll usually highly biased?
Voluntary response bias
If a Chi-Square test results in a huge test statistic and a p-value of $0.0001$, what is your conclusion regarding the null hypothesis?
Reject the null hypothesis. (There is strong evidence of a relationship/difference)
If you weigh a group of people before a diet and weigh the same exact people after the diet, what specific type of t-test should you use?
A Matched Pairs t-test.
If a computer output tells you that R2 = 0.81, what is the strength of the linear relationship (strong, moderate, or weak)?
Strong.
(Taking the square root gives an r value of 0.90 or -0.90, which is very close to 1 or -1).
If Model A has a standard deviation of residuals (S) of 1.2 and Model B has an S of 5.4, which model makes more accurate predictions?
Model A.
A smaller standard deviation of residuals means the actual points are closer to the line.
If you want to test a new fertilizer on plants, and you group the plants by "Sunlight exposure" before randomly assigning the fertilizer, what design are you using?
A Randomized Block Design