Hypothesis tests
Confidence intervals
Correlation/regression
Choose the best hypothesis test
100

In class, we mentioned two weaknesses of related T-tests (compared to independent T-tests). One was that related T-tests have to worry about carryover effects. What was the other one?

Related T-tests are sometimes impossible to implement.

100

You suspect that less than 25% of Americans have heard of your favorite podcast. If you wanted to find out whether this is true or not, would you use a hypothesis test or a confidence interval?

Hypothesis test

100

The regression equation BloodPressure-hat = 42.3 + 0.49*Stress uses someone's stress level to predict their blood pressure. In one sentence, interpret the 0.49.

"A one-point increase in stress level is associated with a predicted 0.49-point increase in blood pressure."

100

You are a zoologist studying red kangaroos. You are writing an article and want to be able to make the claim that “the average red kangaroo can jump over 4 times as high as the average human can!” Humans can jump about 1.33 ft high on average. So, you need to show that kangaroos can jump higher than 5.32 ft on average. You get a sample of 41 kangaroos and measure their jump heights. The sample mean is 5.7 ft and the population standard deviation is known to be 1.2 ft.

Z-test for a single mean
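For anyone who wants to carry the test through, here is a minimal Python sketch using the numbers from the question. It assumes a right-tailed test (since the claim is "higher than 5.32 ft"); the variable names are just illustrative.

```python
import math
from statistics import NormalDist

mu_0 = 4 * 1.33      # null value: 4 times the average human jump (5.32 ft)
sample_mean = 5.7    # mean jump height of the 41 sampled kangaroos
sigma = 1.2          # known population standard deviation
n = 41

# Standard error of the sample mean, then the Z test statistic.
standard_error = sigma / math.sqrt(n)
z = (sample_mean - mu_0) / standard_error

# Right-tailed p-value (assumed direction: kangaroos jump HIGHER than 5.32 ft).
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The Z statistic comes out a little above 2, so the p-value is small enough to reject the null at the usual alpha = 0.05.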

200

What was the main drawback of using an independent comparison test instead of a related comparison test?

They must deal with "individual differences": the tendency for different people to be inherently and naturally different. This can make it difficult to ascribe any differences you see between your two groups to the difference in treatments.

200

"If you take many different samples of the same size from the same population and then construct a 95% confidence interval for each of them, about 95% of these will contain mu." Is this true or false?

True. Think of the simulations we did with all the "green" (good) and "red" (bad) confidence intervals.
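A rough Python sketch of that kind of simulation is below. The population mean, standard deviation, sample size, and number of trials are all made-up values for illustration, and it uses a Z-interval with a known sigma to keep things simple.

```python
import math
import random
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population and simulation settings (not from class).
mu, sigma, n, trials = 50.0, 10.0, 30, 2000

z_star = NormalDist().inv_cdf(0.975)      # critical value for 95% confidence
margin = z_star * sigma / math.sqrt(n)    # same margin for every sample

hits = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = mean(sample)
    # "Green" interval: it captures the true mean mu.
    if x_bar - margin <= mu <= x_bar + margin:
        hits += 1

coverage = hits / trials
print(f"Proportion of intervals containing mu: {coverage:.3f}")
```

The printed proportion should land close to 0.95, though it will wobble a bit from run to run if you change the seed.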

200

You calculate Pearson's r and the regression equation y-hat = mx + b for a collection of data. You know that r and m must have the same....

A. Sign

B. Magnitude

C. Sign and magnitude (i.e. they're the same number)

A
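Why only the sign? The slope can be written as m = r * (SD of y / SD of x), and SDs are never negative. Here is a small Python sketch that checks this on a made-up data set (the x and y values are purely hypothetical).

```python
from statistics import mean, stdev

# Hypothetical data with a downward trend.
x = [1, 2, 3, 4, 5]
y = [9.0, 7.5, 8.0, 4.0, 3.5]

x_bar, y_bar = mean(x), mean(y)
sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
sxx = sum((a - x_bar) ** 2 for a in x)
syy = sum((b - y_bar) ** 2 for b in y)

r = sxy / (sxx * syy) ** 0.5   # Pearson's r
m = sxy / sxx                  # least-squares slope

print(f"r = {r:.3f}, m = {m:.3f}")
print(f"r * (s_y / s_x) = {r * stdev(y) / stdev(x):.3f}")
```

Both r and m are negative here, but they are different numbers, because m also depends on the units of x and y.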

200

Do maples or oaks tend to be taller? You take a sample of 40 maples and 40 oaks. The SD for maple heights is 20 ft and for oaks it's 31 ft.

Independent T-test with pooling

300

You want to see if your die is rigged. One way to test this is to run six different Z-tests for a single proportion: one to see if the proportion of 1's is too high, another to see if the proportion of 2's is too high, etc.

Why is this a bad idea? Be specific. I'm looking for a particular vocabulary phrase.

Your probability of making a type I error becomes much higher than alpha. 
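To get a feel for how much higher, here is a quick Python sketch. It assumes alpha = 0.05 for each test and treats the six tests as independent; neither assumption comes from the question (and the six die-face tests are not truly independent), so read the result as a ballpark figure only.

```python
alpha = 0.05    # assumed significance level for each individual test
num_tests = 6   # one Z-test per face of the die

# Chance of at least one type I error across all six tests,
# under the simplifying assumption that the tests are independent.
chance_of_any_false_alarm = 1 - (1 - alpha) ** num_tests

print(f"{chance_of_any_false_alarm:.4f}")
```

Under these assumptions the chance of at least one false alarm is roughly 26%, far above the 5% you signed up for.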

300

There are two ways to shrink a confidence interval. What are they?

Lower percent confidence or increase sample size.
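Both effects show up directly in the margin of error, z* times sigma / sqrt(n). The Python sketch below compares a baseline interval with one at lower confidence and one with a larger sample. The function name, sigma = 10, and the sample sizes are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist

def margin_of_error(confidence, sigma, n):
    """Margin of error for a Z-interval for a mean (sigma assumed known)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sigma / math.sqrt(n)

baseline = margin_of_error(0.95, 10.0, 25)
lower_confidence = margin_of_error(0.90, 10.0, 25)   # same n, less confidence
larger_sample = margin_of_error(0.95, 10.0, 100)     # same confidence, 4x the n

print(f"baseline:         {baseline:.2f}")
print(f"lower confidence: {lower_confidence:.2f}")
print(f"larger sample:    {larger_sample:.2f}")
```

Notice that quadrupling the sample size only cuts the margin in half, because n sits under a square root.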

300

We find that the correlation between educational level attained and yearly income is r = 0.72. This finding must mean that....

A. higher education causes people to make more money.

B. lower income is associated with higher educational level.

C. people with lower educational levels tend to have lower incomes.

D. people with higher educational levels tend to have lower incomes.

C

300

Studies have shown that ravens and pigeons can sometimes remember faces, especially of people who mistreat them. But is one type of bird more likely than the other to remember someone's face?

Z-test for difference in proportions OR Chi-squared test of independence.

400

Is there a difference in the proportion of iPhone users and Android users who are under 20? You get two samples: one of iPhone users and one of Android users. You count the number of people under 20 and over 20 in each sample. Run a hypothesis test with alpha = 0.10 to see if the proportion of people under 20 is different for iPhone and Android users. Submit your p-value and "plain English" interpretation.

In the iPhone group, 34 were under 20 and 41 were 20+.

In the Android group, 43 were under 20 and 85 were 20+.

The p-value is 0.09615. Since this is less than alpha = 0.10, there is enough evidence to say that the proportion of iPhone users who are under 20 is different from the proportion of Android users who are under 20.
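If you want to check that p-value by hand, here is a minimal Python sketch of a two-tailed Z-test for a difference in proportions using the counts above. It assumes the usual pooled standard error; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Counts from the question.
iphone_under, iphone_total = 34, 34 + 41
android_under, android_total = 43, 43 + 85

p1 = iphone_under / iphone_total
p2 = android_under / android_total

# Pooled proportion under the null hypothesis that the two proportions are equal.
pooled = (iphone_under + android_under) / (iphone_total + android_total)
se = math.sqrt(pooled * (1 - pooled) * (1 / iphone_total + 1 / android_total))

z = (p1 - p2) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed

print(f"z = {z:.3f}, p-value = {p_value:.5f}")
```

The p-value agrees with the 0.09615 above, which is just barely under alpha = 0.10.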

400

I am interested in the proportion of Wellesley students who prefer to study in the library. If I take two different samples of the same size from the same population and then construct a 95% Z-confidence interval for each of them, are the two intervals necessarily the same length? How do you know? Be specific.

Not necessarily! Your interval length is determined, in part, by your SE. Your SE is determined, in part, by your p-hat. So if you get different p-hats in your two samples, then the two intervals you make will be of different lengths.
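Here is a short Python sketch of that idea. The two p-hat values and the sample size of 100 are made up for illustration, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def interval_length(p_hat, n, confidence=0.95):
    """Full length of a Z-confidence interval for a proportion."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # SE depends on p-hat
    return 2 * z_star * se

# Two hypothetical samples of the same size with different p-hats.
length_a = interval_length(0.50, 100)
length_b = interval_length(0.30, 100)

print(f"p-hat = 0.50: length = {length_a:.3f}")
print(f"p-hat = 0.30: length = {length_b:.3f}")
```

Same n, same confidence level, but the sample with p-hat closer to 0.5 gives the longer interval.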

400

A sample of elementary school children is taken. For each one, we note their shoe size and their reading comprehension score. It turns out that these two variables are positively correlated. Clearly an increase in shoe size doesn't cause an increase in reading comp scores. What third, lurking variable is at work here?

Age!

400

Height is important in ultimate frisbee. You decide to do a study to see just how important height is. You take a sample of 32 ultimate frisbee teams, calculate each team's average height, and record the number of wins each team had this season. Does average height help predict a team's success?

Regression