AP Stats March 20 (No Formula Sheets)

Quantitative Distributions

Categorical Distributions

Linear Regression

Probability

100

What is the most commonly used definition of an outlier in AP Statistics?

x < Q₁ - 1.5(IQR) or x > Q₃ + 1.5(IQR)
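A minimal sketch of the 1.5(IQR) rule in Python. The function names and the quartile values are hypothetical, chosen only for illustration, and the quartiles are passed in directly because different tools compute Q₁ and Q₃ slightly differently.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) fences for the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def is_outlier(x, q1, q3):
    """True if x falls strictly outside the fences."""
    lower, upper = outlier_fences(q1, q3)
    return x < lower or x > upper


# Hypothetical quartiles (not from the review game): Q1 = 20, Q3 = 30.
print(outlier_fences(20, 30))  # (5.0, 45.0)
print(is_outlier(48, 20, 30))  # True
```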

100

Martin was analyzing a 2-way table about boys and girls who like pizza (yes/no). Assume an equal number of boys and girls.

Martin concluded that "More boys like pizza". What should he have said?

A higher percentage of boys preferred pizza than girls.

100

The LSRL for a data set where x = height (cm) and y = weight (kg) is y-hat = 0.7x - 65. A student says that someone who is 180cm tall will weigh 61kg.

Why are they wrong?

The LSRL gives a predicted weight, not an exact one. A person who is 180 cm tall is predicted to weigh 61 kg, but actual weights vary around the line.
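The difference between a prediction and an actual value can be sketched as a residual (actual minus predicted). The function name and the "actual" weight below are hypothetical and not part of the original question.

```python
def predict_weight(height_cm):
    """Predicted weight (kg) from the LSRL y-hat = 0.7x - 65."""
    return 0.7 * height_cm - 65


predicted = predict_weight(180)  # about 61 kg

# Hypothetical actual weight of one 180 cm person (assumed for illustration).
actual = 68.0
residual = actual - predicted  # residual = actual - predicted

print(round(predicted, 1), round(residual, 1))
```

A nonzero residual is exactly why "will weigh 61 kg" overstates what the line says.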

100

Aaron gets an A on 60% of his tests. What is the expected number of tests he will need to take until getting his first A?

E(X) = 1/p = 1/0.6 ≈ 1.67 tests
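As a rough check on the 1/p shortcut, the sketch below compares it with a partial sum of k · P(X = k) for a geometric random variable. The function names and the 200-term cutoff are assumptions made for this example.

```python
def expected_trials_until_first_success(p):
    """Mean of a geometric random variable: E(X) = 1 / p."""
    return 1 / p


def expected_trials_partial_sum(p, terms=200):
    """Approximate E(X) by summing k * P(X = k) for k = 1..terms."""
    return sum(k * (1 - p) ** (k - 1) * p for k in range(1, terms + 1))


print(round(expected_trials_until_first_success(0.6), 2))  # 1.67
print(round(expected_trials_partial_sum(0.6), 2))          # 1.67
```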

200

Describe the pros/cons of using:

-Median

-Mean

to describe center.

-Median Pro: resistant to outliers

Con: less precise

-Mean Pro: more precise measure

Con: sensitive to outliers
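A small illustration of resistance, using Python's standard `statistics` module. The data values are made up for this example.

```python
from statistics import mean, median

# Hypothetical test scores (not from the review game).
scores = [70, 72, 75, 78, 80]
with_outlier = scores + [200]  # add one extreme value

# Without the outlier, mean and median agree (both 75).
print(mean(scores), median(scores))

# With the outlier, the mean jumps to about 95.8 while the median only moves to 76.5.
print(mean(with_outlier), median(with_outlier))
```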

200

Eason was analyzing a 2 way table about boys/girls preference between pizza/burgers. He found the following:

Boys - Pizza: 30/50, Burgers 20/50

Girls - Pizza: 10/50, Burgers 40/50

What could Eason say about association in this data?

There is a clear association between gender and food preference in this data: 60% of boys prefer pizza compared with only 20% of girls, so the distribution of food preference is different for boys and girls.
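The comparison behind this answer is a pair of conditional distributions, one per gender. A minimal sketch, with a hypothetical helper name and the counts taken from Eason's table:

```python
# Counts from the question: each row is one gender.
counts = {
    "Boys": {"Pizza": 30, "Burgers": 20},
    "Girls": {"Pizza": 10, "Burgers": 40},
}


def conditional_distribution(row):
    """Convert one row of counts into proportions of that row's total."""
    total = sum(row.values())
    return {category: count / total for category, count in row.items()}


for group, row in counts.items():
    print(group, conditional_distribution(row))
```

If the two printed distributions were the same, there would be no association in the sample.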

200

When we do inference for LSRL, what is the parameter we're doing inference for?

The slope of the population regression line (β)

200

Aaron makes 80% of his free throws. If he takes 10 free throws, what is the probability he makes exactly 7 shots?

P(X=7) = (10 choose 7)(0.8)⁷(0.2)³ = 0.2013

binompdf(n=10, x=7, p=0.8) = 0.2013
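The same calculation can be checked without a calculator. This sketch uses `math.comb` from the standard library; `binomial_pmf` is a hypothetical helper name.

```python
from math import comb


def binomial_pmf(n, k, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


print(round(binomial_pmf(10, 7, 0.8), 4))  # 0.2013
```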

300

Describe the pros/cons of using:

-Standard Deviation

-IQR

to describe variability (spread).

-Standard Deviation Pro: takes value of all data into account; more precise/sensitive measure

Con: outliers can make it misleading

-IQR Pro: resistant to outliers.

Con: doesn't account for change in data values as long as order is the same
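A short sketch of both points, using made-up data and the standard `statistics` module. Note that `statistics.quantiles` uses one particular quartile method, which may not match a calculator on every data set; for these nine values it agrees with the median-of-halves approach.

```python
from statistics import quantiles, stdev


def iqr(data):
    """Interquartile range using statistics.quantiles (default method)."""
    q1, _, q3 = quantiles(data, n=4)
    return q3 - q1


# Hypothetical data: the second list changes only the largest value.
original = [1, 2, 3, 4, 5, 6, 7, 8, 9]
stretched = [1, 2, 3, 4, 5, 6, 7, 8, 90]

print(iqr(original), iqr(stretched))      # the IQR does not change
print(stdev(original), stdev(stretched))  # the standard deviation grows a lot
```

The IQR ignores the change because the order of the values is the same, while the standard deviation reacts to it.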

300

Out of 50 boys, 36 prefer sports, 9 prefer music, and 5 prefer art.

Out of 50 girls, 20 prefer sports, 18 prefer music, and 12 prefer art.

A bad AP Stats student (nobody from my class) wrote the following: "72% of boys prefer sports, 40% of girls prefer sports". What are they missing (just for sports)?

Comparative language, e.g. "A higher percentage of boys (72%) than girls (40%) prefer sports."

300

Martin asked many different students how many hours they studied the week before a big test and compared it to their scores. He created a scatterplot with his data, and found r = 0.96 and r² = 0.9216. How should he interpret the coefficient of determination?

About 92.16% of the variability in test scores for students like these can be explained by the linear relationship with hours studied the week before a big test.

300

Jerry C comes to class on time about 40% of the time. Starting next week, what is the probability that Wednesday is the first class he comes to on time?

P(X = 3) = (0.6)²(0.4) = 0.144

geometpdf(p = 0.4, x = 3) = 0.144
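A minimal sketch of the geometric calculation, assuming class meets every day starting Monday so that Wednesday is the third class. `geometric_pmf` is a hypothetical helper name.

```python
def geometric_pmf(p, k):
    """P(first success occurs on trial k): k - 1 failures, then one success."""
    return (1 - p) ** (k - 1) * p


print(round(geometric_pmf(0.4, 3), 3))  # 0.144
```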

400

Sometimes precise statistics cannot be determined from graphical displays. For histograms, which of the following can be determined precisely?

Shape, Outliers, Center, Variability

Shape* (yes), Outliers* (sometimes), Center (no), Variability (no)

400

Given a set of ordered pairs with s_x = 2.5, s_y = 1.9, r = 0.63, what is the slope of the regression line of y on x?

b = r(s_y / s_x) = 0.63(1.9 / 2.5) ≈ 0.48
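The slope comes from the relationship b = r(s_y / s_x). A quick sketch, with a hypothetical function name:

```python
def lsrl_slope(r, s_x, s_y):
    """Slope of the least-squares regression line of y on x: b = r * s_y / s_x."""
    return r * s_y / s_x


print(round(lsrl_slope(0.63, 2.5, 1.9), 2))  # 0.48
```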

400

Find the mean and standard deviation of the number of defective items among 200 items when 3% of items have defects. Show your work on the board.

mean = np = 200(0.03) = 6

SD = sqrt(np(1-p)) = sqrt(200*0.03*0.97) ≈ 2.41
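Both formulas can be checked with a few lines of Python, treating the count of defective items as binomial. `binomial_mean_sd` is a hypothetical helper name.

```python
from math import sqrt


def binomial_mean_sd(n, p):
    """Return (mean, standard deviation) of a binomial count: np and sqrt(np(1-p))."""
    return n * p, sqrt(n * p * (1 - p))


mu, sigma = binomial_mean_sd(200, 0.03)
print(round(mu, 2), round(sigma, 2))  # 6.0 2.41
```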

500

Other than shape, outliers, center, variability, and context...What other special features do we sometimes describe in a distribution of quantitative data?

Peaks, Gaps, Clusters...also approximation of mode (unimodal, bimodal, etc.)