When there are gaps and clusters in the data, which of the following best depicts/summarizes the distribution?
A. Median and IQR
B. Mean and standard deviation
C. Boxplot
D. Histogram
E. 5-number summary
D
100
IQ scores follow a normal distribution with mean 100 and standard deviation 15. Using the 68-95-99.7 Rule, give an upper bound for the probability that a randomly selected individual has an IQ between 120 and 130.
0.135.
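A minimal sketch of where the bound comes from, assuming the rule's rounded figures (68% within one SD, 95% within two). The interval 120 to 130 sits inside the band from 115 to 130, so the band's probability is an upper bound. The variable names are illustrative only.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# 68-95-99.7 Rule: the band from +1 SD (115) to +2 SD (130)
# holds half of the difference between 95% and 68%.
rule_bound = (0.95 - 0.68) / 2

# Exact normal probability for the narrower interval 120 to 130, for comparison.
exact = iq.cdf(130) - iq.cdf(120)

print(round(rule_bound, 3), round(exact, 4))
```

The exact probability is smaller than the bound, as expected.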
100
What is the sum of the residuals in a regression model?
0
100
Under which of the following conditions is it preferable to use stratified random sampling?
A. The population can be divided into a large number of strata so that each stratum contains a large number of individuals.
B. The population can be divided into strata where the individuals within each stratum are as similar as possible.
C. The population can be divided into strata where the individuals within each stratum are as different as possible.
B.
100
After surveying 995 adults, 81.5% of whom were over 30, the National Sleep Foundation reported that 36.8% of all adults snored. 32% of the respondents were snorers over the age of 30. What percent of the respondents were under 30 and did not snore?
13.7% (0.137)
200
A private college report contains these statistics: 70% of incoming freshmen attended public schools. 75% of public school students who enroll as freshmen eventually graduate. 90% of other freshmen eventually graduate. Do the data show independence between a freshman’s chances to graduate and the kind of high school the student attended? What percent of freshmen eventually graduate?
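No, the variables appear dependent, since the graduation rate differs by school type (75% vs. 90%). Overall, 0.795 (79.5%) of freshmen eventually graduate.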
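The overall graduation rate is a weighted average of the two conditional rates (law of total probability). A short sketch, with illustrative variable names:

```python
p_public = 0.70
p_grad_given_public = 0.75
p_grad_given_other = 0.90

# Law of total probability: weight each conditional rate by its group's share.
p_grad = p_public * p_grad_given_public + (1 - p_public) * p_grad_given_other

# Independence would require P(grad | public) to equal P(grad).
independent = abs(p_grad_given_public - p_grad) < 1e-9

print(round(p_grad, 3), independent)
```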
200
What condition must be met in order to use a normal distribution to approximate a binomial distribution?
At least 10 expected successes and 10 expected failures.
200
Which of the following is/are bad news for a regression model?
A. An observation having a residual of 50.
B. The scatterplot shows a perfectly linear relationship between x and y.
C. The residuals have non-constant variance.
D. More than one of the above.
C
200
What is the sampling distribution for a sample proportion?
Approximately normal (for large n) with mean p and standard deviation sqrt(pq/n), where q = 1 - p.
200
Suppose that a polygraph can detect 65% of lies, but incorrectly identifies 15% of true statements as lies. A certain company believes that 95% of its job applicants are trustworthy. The company gives everyone a polygraph test, asking “Have you ever stolen anything from work?” What’s the probability that a job applicant rejected under suspicion of dishonesty was actually trustworthy?
0.814
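This is a Bayes' rule calculation. The sketch below assumes that trustworthy applicants answer truthfully and that the remaining 5% lie, which is how the stated answer appears to be reached. Variable names are illustrative.

```python
p_trust = 0.95             # prior: applicant is trustworthy
p_flag_given_liar = 0.65   # polygraph detects a lie
p_flag_given_trust = 0.15  # polygraph wrongly flags a true statement

# Joint probabilities of being flagged, split by applicant type.
flag_and_trust = p_trust * p_flag_given_trust
flag_and_liar = (1 - p_trust) * p_flag_given_liar

# Bayes' rule: share of flagged applicants who were actually trustworthy.
p_trust_given_flag = flag_and_trust / (flag_and_trust + flag_and_liar)

print(round(p_trust_given_flag, 3))
```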
300
Last year’s data showed that a town’s January high temperatures average 36 degrees F with a standard deviation of 10 degrees, while in July, the mean high temperature is 74 degrees with a standard deviation of 8 degrees. In which month is it more unusual to observe a high temperature of 55 degrees?
July
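Comparing z-scores makes the answer concrete: the month in which 55 degrees lies more standard deviations from the mean is the more unusual one. A small sketch, with a helper name chosen for illustration:

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from mean."""
    return (value - mean) / sd

z_january = z_score(55, mean=36, sd=10)
z_july = z_score(55, mean=74, sd=8)

# The larger absolute z-score marks the more unusual observation.
more_unusual = "July" if abs(z_july) > abs(z_january) else "January"

print(z_january, z_july, more_unusual)
```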
300
A linear regression model has a residual standard error of 0.35. Out of 100 data points used to fit this model, the largest residual is 0.93. Assuming a normal distribution for the errors, how likely are we to observe a residual at least as extreme as 0.93?
0.0078
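One way to reproduce this figure, assuming "at least as extreme" means a two-sided tail and treating the residual standard error as the SD of the errors. Rounding z to two decimals, as a printed table would, gives 0.0078; the unrounded z gives a slightly larger value.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

z = 0.93 / 0.35            # standardized residual
z_table = round(z, 2)      # 2.66, as read from a z-table

# Two-sided tail probability: residuals beyond +z or below -z.
p_exact = 2 * (1 - std_normal.cdf(z))
p_table = 2 * (1 - std_normal.cdf(z_table))

print(round(p_table, 4))
```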
300
A retailer keeps track of age and annual purchases of its customers. Age has a mean of 29.67 years (SD: 8.51 years) and total yearly purchase has a mean of $572.52 (SD: $253.62). The correlation between the two variables is 0.73. What is the linear regression equation for predicting total yearly purchase from age?
yhat = -72.981 + 21.756 x
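The coefficients follow from the summary statistics: the slope is r times the ratio of SDs, and the line passes through the point of means. A sketch with illustrative names; the intercept here uses the slope rounded to three decimals, which is how the stated value of -72.981 appears to arise.

```python
mean_age, sd_age = 29.67, 8.51
mean_purchase, sd_purchase = 572.52, 253.62
r = 0.73

# Slope: correlation times (SD of response / SD of predictor).
slope = r * sd_purchase / sd_age
slope_rounded = round(slope, 3)

# Intercept: the least-squares line passes through (mean_age, mean_purchase).
intercept = mean_purchase - slope_rounded * mean_age

print(f"yhat = {intercept:.3f} + {slope_rounded:.3f} x")
```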
300
What theorem or law states that the sampling distribution of any mean approaches a normal distribution as the sample size increases?
The Central Limit Theorem
300
Suppose X and Y are two independent random variables, where X has a mean of 3 and a standard deviation of 5, Y has a mean of 4 and a standard deviation of 2. What is the standard deviation of X-Y?
5.385
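For independent variables, variances add even when the variables are subtracted, so the SD of X-Y is the square root of the summed variances:

```python
import math

sd_x, sd_y = 5, 2

# Var(X - Y) = Var(X) + Var(Y) when X and Y are independent.
sd_diff = math.sqrt(sd_x**2 + sd_y**2)

print(round(sd_diff, 3))
```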
400
A dataset has Q1 = 3 and Q3 = 10. A sample of the data points is shown below. Of these, how many outliers are picked out by the boxplot?
0,1,3,5,5,6,6,7,11,15,16
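0. With the usual 1.5 × IQR rule, IQR = 7 and the fences are 3 - 10.5 = -7.5 and 10 + 10.5 = 20.5; every listed point lies between them.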
400
Suppose that the true proportion of green M&M’s is 0.23. Sketch a normal distribution showing the 68-95-99.7 Rule applied to the proportion of green M&M’s found in a random bag of 100 M&M’s.
Mean is 0.23 and standard deviation is 0.042.
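To label the sketch, the cut points at one, two, and three standard deviations can be computed as below. This is a rough illustration; the variable names are not from the original.

```python
import math

p, n = 0.23, 100

# Standard deviation of the sample proportion: sqrt(p * q / n).
sd = math.sqrt(p * (1 - p) / n)

# Intervals covering about 68%, 95%, and 99.7% of sample proportions.
bands = {k: (p - k * sd, p + k * sd) for k in (1, 2, 3)}

for k, (low, high) in bands.items():
    print(k, round(low, 3), round(high, 3))
```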
400
Colleges use SAT scores in the admissions process because they believe the scores provide insight into how a high school student performs at the college level. A model is used, for example, to predict college GPA based on SAT scores. As a student, would you rather have a positive or a negative residual?
Positive residual, as it indicates a student who is doing better than expected.
400
If a 90% confidence interval for a proportion is given by (0.3, 0.7), which of the following is true?
A. There is 90% probability that p is between 0.3 and 0.7.
B. There is 90% probability that p-hat is between 0.3 and 0.7.
C. 90% of our confidence intervals produced this way are expected to contain p.
D. 90% of our confidence intervals produced this way are expected to contain p-hat.
C
400
An orchard owner knows that he’ll have to use about 6% of the apples he harvests for cider because they will have bruises. He expects a tree to produce about 300 apples. Describe an appropriate model for the number of cider apples that may come from that tree. Justify your model. Find the probability there will be no more than a dozen cider apples.
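Binomial with n = 300 and p = 0.06, assuming apples bruise independently. Since np = 18 and nq = 282 are both at least 10, a normal model with mean 18 and standard deviation 4.11 is appropriate. P(X <= 12) is about 0.0721.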
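A sketch of the normal approximation to Binomial(300, 0.06), without a continuity correction, which is the approach that appears to produce the stated probability. Rounding z to -1.46 as with a printed table gives 0.0721; the unrounded z gives a value within about 0.0003 of that.

```python
import math
from statistics import NormalDist

n, p = 300, 0.06

mean = n * p                       # expected number of cider apples
sd = math.sqrt(n * p * (1 - p))    # binomial standard deviation

# Success/failure check for the normal approximation.
normal_ok = n * p >= 10 and n * (1 - p) >= 10

# P(X <= 12) under the approximating normal model.
prob = NormalDist(mean, sd).cdf(12)

print(normal_ok, round(mean, 1), round(sd, 2), round(prob, 3))
```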
500
50 students in a class each toss a fair coin 20 times and record the number of heads obtained. Describe the shape, center, and spread that you expect from a histogram of the 50 data points.
Unimodal, roughly symmetric. Mean is 10, standard deviation is 2.236.
500
Suppose the authors of your statistics textbook want to find out what proportion of students are buying only the e-book versus also a physical copy. I conduct a survey of my two sections, finding that half of STAT 111-1 (23 students) only own the e-book and that one-quarter of STAT 111-3 (19 students) only own the e-book. What is my best guess for the total proportion of introductory statistics students who just buy the e-book and what is its associated 90% confidence interval? (Use z-star = 1.645.)
Best guess is phat of 0.387 and interval is (0.263, 0.511).
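The estimate pools the two sections, weighting each section's fraction by its size, and then applies the usual one-proportion interval. A sketch with illustrative names; the fractional counts simply follow the stated fractions.

```python
import math

sections = [(23, 0.50), (19, 0.25)]  # (class size, fraction owning only the e-book)
z_star = 1.645

n = sum(size for size, _ in sections)
ebook_only = sum(size * frac for size, frac in sections)

p_hat = ebook_only / n
margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
interval = (p_hat - margin, p_hat + margin)

print(round(p_hat, 3), tuple(round(x, 3) for x in interval))
```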
500
A regression model that tries to predict y using x has an R-squared of 0.85. Interpret in words what this 0.85 means.
85% of the variability in y is explained by the model containing x.
500
A survey of 1000 Americans in 2010 found that 40% believe that people are the result of creation rather than evolution. What is a 99% confidence interval for the true proportion of Americans who believe we were created? P(Z < 2.33) = 0.99 and P(Z < 2.58) = 0.995.
(0.36, 0.44)
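A 99% interval leaves 0.5% in each tail, so the critical value is the 0.995 quantile of the standard normal, roughly 2.58. A sketch of the calculation, with illustrative names:

```python
import math
from statistics import NormalDist

p_hat, n = 0.40, 1000

# Critical value for 99% confidence: the 0.995 quantile.
z_star = NormalDist().inv_cdf(0.995)

margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
interval = (p_hat - margin, p_hat + margin)

print(round(z_star, 2), tuple(round(x, 2) for x in interval))
```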
500
While some nonsmokers do not mind being seated in a smoking section of a restaurant, about 60% of customers demand a smoke-free area. A new restaurant with 120 seats is being planned. How many seats should be in the nonsmoking area in order to be 90% sure of having enough seating there?