What is the conditional probability formula?
P(A|B) = P(A and B) / P(B)
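As a quick illustration of the formula, the sketch below computes P(A|B) for a made-up case. The card example (A = king, B = face card) and the helper name `conditional` are assumptions chosen for illustration; they are not part of the original card.

```python
from fractions import Fraction

# Hypothetical example: one card drawn from a standard 52-card deck.
# A = "card is a king", B = "card is a face card (J, Q, K)".
p_b = Fraction(12, 52)       # 12 face cards in the deck
p_a_and_b = Fraction(4, 52)  # every king is also a face card

def conditional(p_a_and_b, p_b):
    """Return P(A|B) = P(A and B) / P(B); undefined when P(B) = 0."""
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b

p_a_given_b = conditional(p_a_and_b, p_b)
print(p_a_given_b)  # 1/3
```

Using `Fraction` keeps the arithmetic exact, so the result can be read directly as a ratio.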
What is the definition of a 95% confidence interval for mean?
If many samples of the same size are taken and a 95% confidence interval is constructed from each one, about 95% of those intervals will contain the true mean of the population.
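The repeated-sampling idea can be checked with a small simulation. This is only a sketch: the population (normal, mean 50, standard deviation 10), the sample size, and the use of a z-interval with known standard deviation are all assumptions made to keep the code short.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the run is repeatable

# Assumed population and study settings (illustrative only).
MU, SIGMA, N, TRIALS = 50, 10, 30, 2000
z_star = NormalDist().inv_cdf(0.975)  # critical value for 95% confidence

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = mean(sample)
    moe = z_star * SIGMA / sqrt(N)  # margin of error, sigma known
    if x_bar - moe <= MU <= x_bar + moe:
        hits += 1  # this interval captured the true mean

coverage = hits / TRIALS
print(coverage)  # should land close to 0.95
```

Each individual interval either contains the true mean or it does not; the 95% describes the long-run capture rate of the method.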
Explain the difference between r and r^2.
r measures the strength and direction of the linear relationship between two variables, whereas r^2 is the proportion of the variability in the response variable that is explained by its linear relationship with the predictor variable.
What's the difference between categorical and quantitative data?
Categorical variables place individuals into groups or categories, while quantitative variables take numerical values for which arithmetic (such as averaging) makes sense.
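When do you use a t-test compared to a z-test?
A t-test is used for inference about means when the population standard deviation is unknown (the usual case). A z-test is used for inference about proportions, or for means when the population standard deviation is known.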
If Kaiden flips a fair coin 3 times, what is the probability that the coin will land heads all three times?
(0.5)^3 = 0.125
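The same answer can be reached by listing every outcome of three flips and counting the all-heads case. A minimal sketch, assuming a fair coin so that all eight outcomes are equally likely:

```python
from itertools import product

# All 2^3 = 8 equally likely sequences of three flips (fair coin assumed).
outcomes = list(product("HT", repeat=3))

all_heads = [o for o in outcomes if all(flip == "H" for flip in o)]
p_all_heads = len(all_heads) / len(outcomes)
print(p_all_heads)  # 0.125
```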
What happens to the confidence interval when you increase the sample size?
A larger sample size will result in a tighter confidence interval with a smaller margin of error.
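To see the effect numerically, the sketch below compares the margin of error for two sample sizes. The standard deviation of 10, the sample sizes 25 and 100, and the helper `margin_of_error` are illustrative assumptions; a z-interval with known standard deviation is used for simplicity.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, confidence=0.95):
    """z* times sigma / sqrt(n), for a mean with known sigma."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sigma / sqrt(n)

sigma = 10  # assumed population standard deviation
moe_small = margin_of_error(sigma, 25)
moe_large = margin_of_error(sigma, 100)
print(round(moe_small, 2), round(moe_large, 2))  # 3.92 1.96
```

Quadrupling the sample size only halves the margin of error, because n enters through its square root.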
What does it mean to have a positive residual?
The observed value is higher than the value predicted by the model (residual = observed - predicted), so the model underestimated.
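Bias can be reduced by increasing the sample size. True or False?
False. A larger sample reduces variability, not bias. Bias comes from how the sample is selected or how the data are measured.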
If the alpha value is 0.05 and the p-value is 0.037, what is the conclusion?
Because the p-value of 0.037 is less than the significance level of 0.05, we reject the null hypothesis. There is convincing evidence to support the alternative hypothesis.
The probability that a randomly chosen dog from a shelter is a golden retriever is 0.40. What is the probability that, in a sample of 15 dogs, exactly 2 of them are golden retrievers?
binompdf(15,0.4,2) = 0.0219
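For readers without a calculator at hand, the binompdf value can be reproduced from the binomial formula C(n, k) p^k (1 - p)^(n - k). The function name `binom_pmf` is just a label chosen here.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# 15 dogs, P(golden retriever) = 0.40, exactly 2 golden retrievers.
prob = binom_pmf(15, 0.4, 2)
print(round(prob, 4))  # 0.0219
```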
What is the critical value t* for a 90% confidence interval based on data with a sample size of 37?
With df = 37 - 1 = 36: t* = invT(1.9/2, 36) = invT(0.95, 36) = 1.688
Based on the following statistics, which data set has the stronger correlation?
Data 1: r=0.6300
Data 2: r^2=0.4624
Data 2, because |r| = sqrt(0.4624) = 0.68, which is greater than 0.63.
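The comparison only works once both values are on the same scale, so the sketch below converts r^2 back to |r| before comparing. Note that r^2 alone cannot reveal the sign of r, only its magnitude.

```python
from math import sqrt

r_data1 = 0.6300
r_sq_data2 = 0.4624

# Strength depends on |r|, so take the square root of r^2 for Data 2.
abs_r_data2 = sqrt(r_sq_data2)

stronger = "Data 2" if abs_r_data2 > abs(r_data1) else "Data 1"
print(round(abs_r_data2, 2), stronger)  # 0.68 Data 2
```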
Mr. A teaches statistics. He has 2 forms of a midterm exam, and he wonders if either form is harder than the other. He teaches approximately 200 total students in 4 sections of the class. He randomly assigns half of the students in each section to each form of the exam. Mr. A will then see if the average scores are significantly different between each form. What type of experiment design is this?
A randomized block design where the 4 sections are the blocks
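How does multiplying the sample size by 4 affect the z statistic?
If the sample statistic and the standard deviation stay the same, the standard error is cut in half (it is divided by sqrt(4) = 2), so the z statistic doubles.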
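A short numeric check of how z = (x_bar - mu_0) / (sigma / sqrt(n)) responds when n is multiplied by 4. All of the numbers below (sample mean 52, hypothesized mean 50, sigma 8, n = 25) are hypothetical, and the sketch assumes the sample mean and sigma do not change when the sample grows.

```python
from math import sqrt

def z_statistic(x_bar, mu_0, sigma, n):
    """One-sample z statistic: (x_bar - mu_0) / (sigma / sqrt(n))."""
    return (x_bar - mu_0) / (sigma / sqrt(n))

# Hypothetical values, held fixed while n changes.
x_bar, mu_0, sigma = 52, 50, 8

z_n = z_statistic(x_bar, mu_0, sigma, 25)    # standard error = 1.6
z_4n = z_statistic(x_bar, mu_0, sigma, 100)  # standard error = 0.8
print(round(z_n, 3), round(z_4n, 3))  # 1.25 2.5
```

The standard error shrinks by a factor of 2, and because it sits in the denominator, the z statistic grows by the same factor.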
Babi uses an SRS to conduct a survey and asks AP students what they scored on their exams. He knows that 8% of the students get a 5. What is the probability that the first score of 5 does not occur within the first 6 students he surveys?
1-geometcdf(.08,6)=0.606
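The geometcdf value is the probability that the first success occurs on or before a given trial, so its complement is simply the chance of six failures in a row. A minimal sketch, with `geom_cdf` as a name chosen here:

```python
def geom_cdf(p, k):
    """P(first success occurs on or before trial k)."""
    return 1 - (1 - p) ** k

p_five = 0.08  # probability a surveyed student scored a 5

# No 5 among the first 6 students = complement of the geometric cdf.
prob_no_five_in_six = 1 - geom_cdf(p_five, 6)
print(round(prob_no_five_in_six, 3))  # 0.606
```

Equivalently, this is 0.92^6.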
The following computer output represents the regression analysis on the relationship between the number of hours studied and the grade earned (%) on an exam for 22 students. Assume all conditions for inference are met. Construct and interpret a 95% confidence interval for the slope of the least-squares regression line.
Variable   Coef    SE Coef   t       p
Constant   61.42   3.55      17.32   0.00
Hours      4.77    0.46      10.47   0.00
s = 7.234   R-sq = 62.4%
t* = 2.086 (df = 22 - 2 = 20)
4.77+2.086(0.46)=5.73
4.77-2.086(0.46)=3.81
(3.81,5.73)
We are 95% confident that each additional hour of studying is associated with an increase of between 3.81 and 5.73 percentage points in the predicted exam grade.
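The interval arithmetic follows the pattern slope ± t* × SE(slope). The sketch below repeats it with the values from the output; t* = 2.086 is taken as given from the card (df = 20) rather than computed, since the Python standard library has no inverse t function.

```python
# Values read from the regression output for the Hours row.
slope = 4.77
se_slope = 0.46
t_star = 2.086  # taken from the card; corresponds to df = 22 - 2 = 20

margin = t_star * se_slope
lower, upper = slope - margin, slope + margin
print(round(lower, 2), round(upper, 2))  # 3.81 5.73
```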
The correlation between the number of seniors who skipped school on skip day and their grades is -0.60.
The correlation between the number of seniors who skipped school on skip day and their phone usage is 0.75.
Which of the following statements is true?
i. As screen time increases, the number of seniors skipping school decreases.
ii. Skipping school causes grades to drop.
iii. The correlation between skipping school and phone usage is stronger than the correlation between skipping school and grades.
Only iii
What type of bias is this?
Mr. A conducts an end-of-the-year survey and he asks, "In what ways do I need to improve as a teacher?"
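Response bias (the wording of the question assumes he needs to improve, which steers respondents toward a particular kind of answer)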
In this example, describe what would happen if a type 2 error was committed.
Research Hypothesis: Aidan is better at AP stats mcq than Kaiden.
Null Hypothesis: Aidan is no better at AP stats mcq than Kaiden.
If a Type II error is committed, we fail to reject the null hypothesis even though it is false. We conclude there is no convincing evidence that Aidan is better when, in fact, he is better at the MCQ. People may not ask Aidan to tutor them, although they would be better off being tutored by Aidan than by Kaiden.
Mr. A believes that the number of students in 1st block who did well on the chi-squared test is normally distributed with a mean of 80 and a standard deviation of 10. He also believes that the number of students in 3rd block who did well on the chi-squared test is independent of the number from 1st block and is normally distributed with a mean of 83 and a standard deviation of 7.
What is the probability that on any given day, the students in his 3rd block will do better than students in his 1st block on the chi-squared test?
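Let D = (3rd block) - (1st block). D is normally distributed with mean 83 - 80 = 3 and standard deviation sqrt(10^2 + 7^2) = 12.2, so P(D > 0) = normalcdf(0, 1E99, 3, 12.2) = 0.597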
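The normalcdf call works on the difference of the two blocks, which is itself normal because the two variables are independent and normal. The sketch below rebuilds that difference distribution with Python's `statistics.NormalDist`; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

# D = (3rd block) - (1st block); the variances add because the
# two blocks are independent.
mean_diff = 83 - 80
sd_diff = sqrt(10**2 + 7**2)

diff = NormalDist(mean_diff, sd_diff)
p_third_better = 1 - diff.cdf(0)  # P(D > 0)
print(round(sd_diff, 1), round(p_third_better, 3))  # 12.2 0.597
```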
Both Dover High School and Oyster River are known to have good track teams. However, a teacher suspects that Dover's track team outperforms Oyster River's. In a study of volunteers from the track teams, the 95% confidence interval estimate of the difference in the mean time to complete the mile was (2,5). What is a reasonable conclusion?
The conditions for constructing a confidence interval are not met because the sample consists of volunteers rather than randomly selected runners, so any conclusion would be invalid.
In the context of regression analysis, which of the following statements are true?
i. When the sum of the residuals is zero, the model is nonlinear.
ii. Outliers do not affect the coefficient of determination.
iii. Influential points will increase the correlation coefficient.
None of the statements are true
Samriddha is redesigning his company's website. He wants to carry out an experiment that involves randomly assigning visitors to either the existing design or the new design. A user's experience is slightly different depending on what type of device (mobile or desktop) they use, so Samriddha wants to use a randomized block design. How should Samriddha form the blocks?
Users within each block should use the same type of device as each other.
Do more seniors take AP Calculus than AP Statistics? A two-sample t-test of the hypotheses H0: μ_calc = μ_stats vs. Ha: μ_calc > μ_stats results in a P-value of 0.02.
Which of the following statements must be true?
i. A 90% confidence interval for the difference in means contains 0.
ii. A 95% confidence interval for the difference in means contains 0.
iii. A 99% confidence interval for the difference in means contains 0.
Only iii. The one-sided P-value of 0.02 corresponds to a two-sided P-value of 0.04, which is below 0.10 and 0.05 but above 0.01, so only the 99% interval contains 0.
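The link between the test and the intervals can be laid out step by step. The sketch below assumes the usual duality between a two-sided test at level alpha and a (1 - alpha) confidence interval built from the same standard error and degrees of freedom: the interval contains 0 exactly when the two-sided P-value exceeds alpha.

```python
one_sided_p = 0.02
two_sided_p = 2 * one_sided_p  # 0.04, since the test was one-sided

# Assumed duality: a C% interval contains 0 when the two-sided
# P-value is larger than alpha = 1 - C.
contains_zero = {}
for level in (0.90, 0.95, 0.99):
    alpha = 1 - level
    contains_zero[level] = two_sided_p > alpha

print(contains_zero)  # {0.9: False, 0.95: False, 0.99: True}
```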