One-Variable Statistics
Two-Variable Statistics
Collecting Data
Probability
Random
100

Which of the following values can be negative?

A) The IQR

B) The Correlation Coefficient

C) The Sample Variance

D) The Coefficient of Determination

B) The Correlation Coefficient

100

If the correlation coefficient between the two components of a bivariate dataset is exactly 0.56, then what is the sum of all possible values of the coefficient of determination calculated from this bivariate dataset?

A) 0

B) 0.3136

C) 0.56

D) 0.7483

E) None of the Above

B) 0.3136

100

Veronica wants to know about the general public's rating of her new type of gum. So she goes to a local supermarket (where people typically purchase gum) and convinces the first 250 people she sees to try a piece of gum (chosen by her at random): either her new gum or the leading competitor's gum. Once they are given one piece of gum and asked to chew it for one minute, they are asked to spit it out. The subjects are then told to report which piece of gum they prefer. Both pieces of gum look identical both before and after being chewed. 


What type of sampling methodology did Veronica use for her study?

A) Convenience Sample

B) Systematic Random Sample

C) Cluster Random Sample

D) Stratified Random Sample

E) None of the Above

A) Convenience Sample

100

Which theorem below is the supporting theorem for the notion that as the size of a random sample from some population approaches infinity, the mean of the sample will approach the population mean?

A) The Law of Large Numbers

B) The Central Limit Theorem

C) The Empirical Rule

D) Bayes' Theorem

E) None of the above

A) The Law of Large Numbers

200

Use the following dataset:

{25, 48, 29, 35, 79, 53, 58, 19, 8, 11, 21, 45, 56, 78, 34, 46, 18, 39, 60, 72}

Find the sum of all five numbers in the 5-number summary.

Answer: _________

209
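
The answer above can be checked with a short sketch. It assumes the "35 79" in the dataset is two separate values (35 and 79) and that quartiles are the medians of the lower and upper halves of the sorted data, which is the usual AP Statistics convention; other software may use a different quartile rule. The function name `five_number_summary` is only illustrative.

```python
from statistics import median

# Dataset from the question, reading "35 79" as two values (assumption).
data = [25, 48, 29, 35, 79, 53, 58, 19, 8, 11,
        21, 45, 56, 78, 34, 46, 18, 39, 60, 72]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    s = sorted(values)
    n = len(s)
    half = n // 2
    lower = s[:half]
    # For an odd count, the middle value is left out of both halves.
    upper = s[half:] if n % 2 == 0 else s[half + 1:]
    return (s[0], median(lower), median(s), median(upper), s[-1])

summary = five_number_summary(data)
total = sum(summary)
print(summary, total)
```

Under these assumptions the summary is 8, 23, 42, 57, 79, which sums to 209.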

200

A national consumer magazine reported the following correlations.

  • The correlation between car weight and car reliability is -0.30.
  • The correlation between car weight and annual maintenance cost is 0.20.

Which of the following statements are true?

I. Heavier cars tend to be less reliable.
II. Heavier cars tend to cost more to maintain.
III. Car weight is related more strongly to reliability than to maintenance cost.

 (A) I only
 (B) II only
 (C) III only
 (D) I and II
 (E) I, II, and III

(E) I, II, and III

200

Which of the following is the minimum sample size required to estimate an unknown population proportion with 95% confidence and a margin of error of 3.1% when no prior estimate for the proportion is known?

A) 999

B) 1000

C) 1067

D) 1068

B) 1000
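
A minimal sketch of the sample-size calculation, assuming the conservative guess p = 0.5 (no prior estimate) and the standard normal critical value for 95% confidence. The function name `min_sample_size` and its parameters are illustrative, not from the original question.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(margin, confidence=0.95, p_guess=0.5):
    """Smallest n with z* * sqrt(p(1-p)/n) <= margin, rounded up."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z_star ** 2 * p_guess * (1 - p_guess) / margin ** 2)

n = min_sample_size(0.031)
print(n)
```

The unrounded value is a little over 999, so it must be rounded up to the next whole subject.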

300

This type of graph is used to display the conditional distribution of a categorical response variable for each value (or level) of a categorical explanatory variable. 

Segmented (or Stacked) Bar Graph

300

If the correlation coefficient between the two components of the bivariate data set is exactly 0.56, then what is the sum of all possible values of the coefficient of determination calculated from the bivariate dataset?

A) 0

B) 0.3136

C) 0.56

D) 0.7483

E) None of the above

B) 0.3136

300

Which of the following are important aspects (and in some cases, absolute requirements) of a well-designed experiment?

   1. Random assignment of subjects or experimental units to at least two treatment groups

   2. Using a large number of subjects or experimental units to help reduce sampling variability

   3. Using a placebo and double-blinding when they are possible and called for

   4. Controlling, as far as possible, the confounding variables that can influence the response variable

A) 1, 2, and 4

B) 1, 2, and 3

C) 2, 3, and 4

D) 1, 3, and 4

E) None of the Above


E) None of the above

300

Suppose Harrison Butker is practicing his field goal kicking and he has a constant 89.1% chance of successfully making a field goal on any attempt, independent of all other attempts.

Harrison continues to attempt field goals until he successfully makes one. What is the sum of the mean and the standard deviation of the random variable that appropriately models this situation? Round your answer to 3 decimal places.

A) 1.742

B) 1.493

C) 1.260

D) 1.245

E) None of the above

B) 1.493
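
A quick check of this answer, assuming the random variable is geometric (number of attempts up to and including the first success) with p = 0.891, so the mean is 1/p and the standard deviation is sqrt(1 - p)/p:

```python
from math import sqrt

p = 0.891                 # probability of success on each attempt
mean = 1 / p              # expected number of attempts until the first make
sd = sqrt(1 - p) / p      # standard deviation of a geometric random variable
answer = round(mean + sd, 3)
print(answer)
```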

300

Which of the following statements are true?

I. The center of a confidence interval is a population parameter.
II. The bigger the margin of error, the smaller the confidence interval.
III. The confidence interval is a type of point estimate.
IV. A population mean is an example of a point estimate.

(A) I only
 (B) II only
 (C) III only
 (D) IV only
 (E) None of the above.

E) None of the above

400

Which of the following graphical displays would best convey an association between two categorical variables by displaying the conditional distribution of the response variable (Favorite Professional Sports Team) for each level of the explanatory variable (Gender: Male or Female), while also showing the proportional breakdown between the levels of the explanatory variable, which is achieved by making the width of each bar proportional to the relative frequency of the category of the explanatory variable that the bar represents?

A) A mosaic plot

B) A side-by-side bar graph

C) A stacked (or segmented) bar graph

D) A histogram

A) A mosaic plot

400

A polling firm wants to contact a random sample of people likely to vote in an upcoming nationwide election. They will use a random digit dialer to generate and call phone numbers at random, so the poll will include people with landlines, unlisted numbers, and mobile phones.

The random digit dialer skips invalid phone numbers. If a person doesn't answer a call, the dialer will try one more time, and then skip that number. When a person does answer, a pollster asks a set of initial questions—such as the person's age and whether or not they voted in previous elections—to see if they are likely to vote in the upcoming election. If they are eligible and likely to vote, the pollster will then ask a series of questions about the election.

Which of the following is not a potential source of bias in their poll? 

(A) Some people might refuse to share their personal information over the phone.


(B) People who are likely to vote but don't have a telephone are excluded from the poll.


(C) Some people will not answer calls from an unfamiliar caller.


(D) The poll excludes people too young to vote.

 

(E) Some people might say they are registered and suggest they are likely to vote when they aren't.


(F) None of the above

(D) The poll excludes people too young to vote.

400

Sam is taking his AP Statistics test and gets confused between the characteristics of a geometric distribution versus a binomial distribution. Each statement below is assigned a value in the parentheses next to the statement. Find the sum of the statement values that correspond to the statements that describe features of a geometric distribution.

 - There are one or more Bernoulli trials with all failures and a trial usually ending in a success (-10)

- Each trial is independent (2)

- There are two possible outcomes on each trial (3)

- There is a fixed number of trials (-6)

- The distribution is skewed to the right (-1)

- The probability of success and failure does not change from one Bernoulli trial to the next (0)


A) -3

B) -11

C) -1

D) 1

E) None of the Above


E) None of the Above

The correct answer is -6

-10 + 2 + 3 + (-1) + 0 = -6

400

Which of the following statements about statistical hypothesis testing are true?

I. One should always decide upon a level of significance (α) after observing the p-value of the test.

II. If we were to observe a p-value of 0.01 at a level of significance of α = 0.05, we can definitively conclude that the null hypothesis is absolutely false.

III. The p-value measures the likelihood that the null hypothesis is true given the data used in the test.

A: I and II only

B: II and III only

C: I and III only

D: All three statements are true

E: None of the above

E: None of the above


Note: A p-value in statistical hypothesis testing represents the probability of observing data as extreme or more extreme than what was actually observed, assuming the null hypothesis is true.



500

Suppose Mr. Rooze gave an Algebra 2 test to all of his Algebra 2 honors students, but he is dissatisfied with the distribution of the scores. So he decides to "curve" them via a linear transformation: multiplying each student's original score by 0.8 and then adding 20 points to that result. Then, for fun, Mr. Rooze randomly selects one of his Algebra 2 students and determines the z-score of the student's score after the linear transformation is applied. Suppose we let Z0 represent the original z-score of the student's score and Z1 represent the z-score of the student's score after the linear transformation is applied to the student's original score. Which of the following describes the correct relationship between Z0 and Z1?

A: Z1 = 0.8(Z0) + 20

B: Z1 = Z0 + 20

C: Z1 = Z0

D: Z1 = 0.8(Z0)

C: Z1 = Z0
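
The reason the z-score does not change is that multiplying by 0.8 and adding 20 rescales the mean and the standard deviation in the same way as each score. A small numerical sketch, using a made-up list of scores (the values in `original` are hypothetical and not from the question):

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize each value using the mean and SD of the whole list."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

original = [52, 61, 70, 74, 88]             # hypothetical test scores
curved = [0.8 * x + 20 for x in original]   # the curve described in the question

z0 = z_scores(original)
z1 = z_scores(curved)

# Largest difference between matching z-scores; should be ~0 up to rounding.
max_diff = max(abs(a - b) for a, b in zip(z0, z1))
print(max_diff)
```

Any linear transformation with a positive multiplier leaves z-scores unchanged; a negative multiplier would flip their signs.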

500

Mrs. Russell collects data on the number of hours her students spend on TikTok per day and would like to see if it can reliably help predict their Midterm exam grades. There was a data point that Mrs. Russell believed was submitted as a joke. A student reported that they spend an average of 23.5 hours per day on TikTok. This was 17.5 hours higher than any other student's response to that question. Once it was removed from the bivariate data set (along with its associated Midterm exam score), both the correlation coefficient and the slope of the least-squares regression line (LSRL) changed significantly. In general, a point that has such great influence or "pull" on the regression line is typically considered a high ______ point.

What is the missing word?

"leverage"

500

Gender and body weight are confounding factors in determining a person's blood alcohol concentration (BAC) after consuming a given number of alcoholic beverages within one hour. Suppose we wish to conduct a randomized block-design experiment on a set of 4000 volunteer subjects, exactly half of whom are male and half of whom are female. We additionally classify the 2000 subjects within each gender block by weight, according to the quartile in which their weight falls after we compute the five-number summary of the weights for each subset of 2000 male or 2000 female subjects. We then measure the BAC of each subject within each gender and weight-class block after having each of them consume either 1, 2, 3, 4, or 5 alcoholic beverages within an hour (the number is assigned at random to each subject within each gender and weight-class block) to determine the mean BAC within each gender and weight-class block for each level of the treatment, namely the number of alcoholic beverages consumed within the hour. The results are then analyzed and compared.

This experiment utilizes two blocking factors and one experimental factor. How many levels does the one experimental factor have?

5

500

Roshini loves trigonometry and proving trigonometric identities in particular. On any given attempted trigonometric identity proof, Roshini has an 85% chance of getting to the correct answer on her first try. You may assume that each trigonometric identity proof she attempts is independent of all others. 

Let random variable X represent the number of trigonometric identity proofs where Roshini gets to the correct answer on her first try in a set of 12 independent and randomly selected trigonometric identity proofs. What is the sum of the mean and the standard deviation of random variable X? Round to 4 decimal places.

11.4369
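
A quick check of this value, assuming X is binomial with n = 12 and p = 0.85, so the mean is np and the standard deviation is sqrt(np(1 - p)):

```python
from math import sqrt

n, p = 12, 0.85
mean = n * p                   # expected number of first-try successes
sd = sqrt(n * p * (1 - p))     # standard deviation of a binomial count
answer = round(mean + sd, 4)
print(answer)
```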

500

Suppose a UC Berkeley professor randomly sampled 180 students from the large population of all UC Berkeley students to construct a 95% confidence interval for μ, the true population mean amount of time spent on homework each night by all UC Berkeley students. Suppose the interval the professor calculated is 2.3 hours < μ < 4.5 hours. If the interval was instead created using a 99% level of confidence, how would it have changed in comparison to the original 95% confidence interval? You may assume all assumptions and conditions for constructing both intervals are met and that all other values used to construct each interval remain the same; thus, only the level of confidence is changed.

A: The new 99% confidence interval would be wider than the original 95% confidence interval.

B: The new 99% confidence interval would be narrower than the original 95% confidence interval.

C: The new 99% confidence interval would be the same width as the original 95% confidence interval.

D: We cannot determine how the new confidence interval will change in comparison to the original confidence interval without knowing more information.

E: None of the above

  A: The new 99% confidence interval would be wider than the original 95% confidence interval.
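
A rough sketch of why the interval widens. It assumes, for simplicity, normal (z) critical values; an interval for a mean would normally use t critical values with 179 degrees of freedom, which changes the numbers slightly but not the direction. The helper `z_star` is illustrative.

```python
from statistics import NormalDist

lower, upper = 2.3, 4.5            # the reported 95% interval, in hours
center = (lower + upper) / 2       # sample mean
margin_95 = (upper - lower) / 2    # 95% margin of error

def z_star(confidence):
    """Two-sided standard normal critical value (simplifying assumption)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Same standard error, larger critical value, so the margin scales up.
margin_99 = margin_95 * z_star(0.99) / z_star(0.95)
new_interval = (center - margin_99, center + margin_99)
print(new_interval)
```

With everything else held fixed, only the critical value grows, so the 99% interval contains the 95% interval.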
