1-sample Confidence Intervals
1-sample Hypothesis Testing
2-sample Confidence Intervals
2-sample Hypothesis Testing
Comparing Counts
100

Find the critical value for a two-sided t-test given the following information:

alpha = 0.05

n = 45 

*Answer with an Excel formula

CV = t.inv(0.025, 44) = +/- 2.015

100

What assumption must we hold when calculating test statistics? Why?

We must assume the null hypothesis is true. If we didn't, the result of our test would tell us nothing about the claimed parameter of interest.

100

A 95% confidence interval for the difference in the average customer satisfaction ratings between two products is (-4.23, 6.22), where the difference is defined as mu_A – mu_B. What can we conclude about the difference in customer satisfaction ratings between the two products?

Since the confidence interval includes 0, we cannot conclude that there is a statistically significant difference in average customer satisfaction ratings at the 95% confidence level. The true difference in average satisfaction ratings could be negative (Product B rated higher), positive (Product A rated higher), or zero (equal ratings).

100

A company implements a new software tool to improve data entry speed. To evaluate its impact, they measure the number of records entered per week before and after using the tool for a sample of 12 employees. The differences (After − Before) have a sample mean of d_bar = 15.8 records and a sample standard deviation of s_d = 8.9 records. Test whether there is a difference in data entry speed due to the software tool.

Based solely on the test statistic, do you think you will reject or fail to reject the null hypothesis and why?

Test stat = (15.8 – 0) / (8.9 / sqrt(12)) = 6.15

Even without computing the p-value or comparing to the critical value, we can see that our test statistic is so extreme that it is highly likely that we would reject the null hypothesis. 
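The paired test statistic can be checked with a few lines of Python. This is a minimal sketch that uses only the values stated in the question; the variable names are illustrative.

```python
import math

# Values from the question: paired differences (After - Before)
n = 12        # number of employees
d_bar = 15.8  # sample mean of the differences
s_d = 8.9     # sample standard deviation of the differences
mu_0 = 0.0    # mean difference assumed under H0

standard_error = s_d / math.sqrt(n)
t_stat = (d_bar - mu_0) / standard_error

print(f"SE = {standard_error:.3f}, t = {t_stat:.2f}")
```

A statistic this far from 0 sits well beyond any usual t critical value for 11 degrees of freedom.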

100

For a χ² distribution with 23 degrees of freedom, find:

A) E(X)

B) Var(X)

A) E(X) = df = 23

B) Var(X) = 2*df = 46

200

A 95% confidence interval based on a sample of size 43 is [145.23, 167.67].

A) What is the point estimate?

B) What is the margin of error?

A) Point estimate = (167.67 + 145.23) / 2 = 156.45

B) Margin of error = 167.67 – 156.45 = 11.22
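Because a confidence interval of this kind is symmetric around the point estimate, both quantities can be recovered from the endpoints alone. A short sketch, with illustrative variable names:

```python
# Interval endpoints from the question
lower, upper = 145.23, 167.67

point_estimate = (lower + upper) / 2   # midpoint of the interval
margin_of_error = (upper - lower) / 2  # half the interval width

print(f"point estimate = {point_estimate:.2f}")
print(f"margin of error = {margin_of_error:.2f}")
```

The sample size of 43 is not needed for either calculation.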

200

I claim that the proportion of people who prefer Netflix to AppleTV is 45%. If I calculate a test statistic of z = 1.72 based on my sample, what is the p-value?

*Answer with an Excel formula

= 2 * norm.dist(-1.72, 0, 1, TRUE)
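For readers who want the numeric value, the same two-sided p-value can be sketched in Python. The standard library's `statistics.NormalDist` plays the role of Excel's cumulative `norm.dist` here.

```python
from statistics import NormalDist

z = 1.72  # test statistic from the question

# Two-sided p-value: twice the area in the tail beyond |z|
p_value = 2 * NormalDist(mu=0, sigma=1).cdf(-abs(z))

print(f"p-value = {p_value:.4f}")
```

Using `-abs(z)` keeps the calculation correct whether the test statistic is positive or negative.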

200

A researcher compares test scores from two groups:

Group A: n = 30, x_bar = 78, s = 10

Group B: n = 35, x_bar = 74, s = 12

Construct a 95% confidence interval for the difference in population means. 

t.inv(0.05, 29) = 2.88

t.inv(0.025, 29) = 2.045

t.inv(0.025, 30) = 1.05

CV = t.inv(0.025, 29) = 2.045

Standard error = sqrt[ 10^2 / 30 + 12^2 / 35 ] = 2.729

CI: (78 – 74) +/- 2.045 * 2.729 = (-1.58, 9.58)
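The interval arithmetic can be reproduced as follows. This is a sketch, not a general-purpose routine: the critical value 2.045 is taken from the answer (df = 29, the smaller sample size minus 1) rather than computed, since the Python standard library has no t distribution.

```python
import math

# Summary statistics from the question
n_a, mean_a, s_a = 30, 78, 10
n_b, mean_b, s_b = 35, 74, 12

t_crit = 2.045  # from the answer: two-sided 95% critical value with df = 29

standard_error = math.sqrt(s_a**2 / n_a + s_b**2 / n_b)
difference = mean_a - mean_b
margin = t_crit * standard_error
ci = (difference - margin, difference + margin)

print(f"SE = {standard_error:.3f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Since the interval contains 0, the data do not show a significant difference between the group means at this confidence level.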

200

A nutritionist wants to compare the average daily protein intake between two groups of adults following different diets. Samples produce the following results:

Diet A: n = 15, x_bar = 72, s = 10

Diet B: n = 12, x_bar = 65, s = 8

At alpha = 0.05, test whether there is a difference in mean protein intake between the two diets. 

t.inv(0.05, 12) = 3.11 

t.inv(0.025, 11) = 2.2 

t.inv(0.05, 12) = 0.87

H0: mu1 – mu2 = 0 | Ha: mu1 – mu2 != 0 | df = 11 | CV = 2.2 

Standard error = sqrt(10^2 / 15 + 8^2 / 12) = 3.464

Test statistic = (72 – 65) / 3.464 = 2.02

Since the test statistic (2.02) is less extreme than the critical value (2.2), we fail to reject H0.

200

A candy company claims each bag of candy has the following distribution. Test whether the company’s distribution is a good fit at alpha = 0.05.


Color | Expected % | Observed
Red   | 30%        | 18
Blue  | 20%        | 14
Green | 50%        | 28

χ²(0.05, 1) = 3.22           χ²(0.05, 2) = 5.99         χ²(0.05, 3) = 7.81


Expected counts: Red (18), Blue(12), Green(30)

Test stat = (18 – 18)^2 / 18 + (14 – 12)^2 / 12 + (28 – 30)^2 / 30 = 0.47

CV = 5.99

Since our test statistic is not as extreme as our critical value, we fail to reject H0.
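The goodness-of-fit calculation can be sketched in Python. The critical value 5.99 (alpha = 0.05, df = 3 – 1 = 2) is taken from the answer rather than computed, and the dictionary names are illustrative.

```python
# Observed counts and claimed proportions from the question
observed = {"Red": 18, "Blue": 14, "Green": 28}
claimed = {"Red": 0.30, "Blue": 0.20, "Green": 0.50}

total = sum(observed.values())

# Expected count = claimed proportion * total number of candies
expected = {color: claimed[color] * total for color in observed}

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_sq = sum(
    (observed[color] - expected[color]) ** 2 / expected[color]
    for color in observed
)

critical_value = 5.99  # from the answer: alpha = 0.05, df = 2
reject_h0 = chi_sq > critical_value

print(f"chi-square = {chi_sq:.2f}, reject H0: {reject_h0}")
```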

300

A sample of 50 students from a university has a mean GPA of 3.2 with a standard deviation of 0.5. Of the 50 students sampled, 32% indicated that they were business students. 

Calculate a 95% confidence interval for the population proportion of the university’s students who are business majors.

CV = norm.inv(0.975, 0, 1) = 1.96 

0.32 +/- 1.96 * (sqrt(0.32 * 0.68 / 50)) = (0.19, 0.45)
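A short Python sketch of the same interval, using `statistics.NormalDist().inv_cdf` as a stand-in for Excel's `norm.inv`:

```python
import math
from statistics import NormalDist

n = 50
p_hat = 0.32       # sample proportion of business students
confidence = 0.95

# Two-sided critical value, equivalent to norm.inv(0.975, 0, 1)
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z_crit * standard_error
lower, upper = p_hat - margin, p_hat + margin

print(f"z = {z_crit:.2f}, 95% CI: ({lower:.2f}, {upper:.2f})")
```

The GPA mean and standard deviation in the question are not used; only the sample proportion and sample size matter here.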

300

A pharmaceutical company is testing whether a new drug changes the average recovery time from a certain illness. Historically, the mean recovery time is 10 days with a known population standard deviation of 3 days.

A researcher collects a sample of 36 patients treated with the new drug. They conduct a hypothesis test at alpha = 0.025 to test whether the drug reduces the average recovery time. 

1. State whether this is a one-tailed or two-tailed test and why.

2. Write the Excel formula that would give the critical value for this hypothesis test.

3. Suppose the sample mean is 8.9 days. What is the test statistic?

1. This is a one-sided, left-sided test since we are testing whether the drug reduces the recovery time (i.e., mu < 10)

2. = norm.inv(0.025, 0, 1) (a z critical value, since the population standard deviation is known)

3. (8.9 - 10) / (3 / sqrt(36)) = -2.2

300

Two independent random samples are taken from two different populations to estimate the difference in the proportion of people who support a certain policy. The results are as follows:

Sample 1: 200 people, with 120 supporting the policy

Sample 2: 250 people, with 140 supporting the policy

Construct a 95% confidence interval for the difference between population proportions.

p_hat1 = 0.60 | p_hat2 = 0.56 |n1 = 200 | n2 = 250 | CV for 95% CI: 1.96

Standard error = sqrt[ 0.6 * 0.4 / 200 + 0.56 * 0.44 / 250 ] = 0.0467

CI: (0.60 – 0.56) +/- 1.96 * 0.0467 = (-0.05, 0.13)
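The two-proportion interval can be checked with a minimal Python sketch. It uses the unpooled standard error, as in the answer; variable names are illustrative.

```python
import math
from statistics import NormalDist

# Sample results from the question
x1, n1 = 120, 200
x2, n2 = 140, 250

p_hat1 = x1 / n1
p_hat2 = x2 / n2

z_crit = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value

# Unpooled standard error of the difference in sample proportions
standard_error = math.sqrt(
    p_hat1 * (1 - p_hat1) / n1 + p_hat2 * (1 - p_hat2) / n2
)
difference = p_hat1 - p_hat2
margin = z_crit * standard_error
lower, upper = difference - margin, difference + margin

print(f"SE = {standard_error:.4f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```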

300

A company tests whether a new training program reduces the time required to complete a task compared to the standard program. Samples produce the following results:

New program: n = 20, x_bar = 18.4, s = 3.5

Standard program: n = 16, x_bar = 21.1, s = 4.2

At alpha = 0.01, test whether the new program leads to lower completion times.

t.inv(0.005, 15) = 1.12 

t.inv(0.005, 16) = 1.34 

t.inv(0.01, 15) = 2.60

Ho: mu1 – mu2 = 0 | Ha: mu1 – mu2 < 0 | df = 15 | CV = -2.60 (one-sided, left-tailed test)

Standard error = sqrt(3.5^2 / 20 + 4.2^2 / 16) = 1.31

Test statistic = (18.4 – 21.1) / 1.31 = -2.06

Since our test statistic (-2.06) is not as extreme as our critical value (-2.60), we fail to reject Ho.
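A sketch of the same left-tailed comparison in Python. The critical value is taken from the answer (alpha = 0.01, df = 15) and entered as a negative number because the rejection region is in the left tail.

```python
import math

# Summary statistics from the question
n_new, mean_new, s_new = 20, 18.4, 3.5
n_std, mean_std, s_std = 16, 21.1, 4.2

t_crit = -2.60  # from the answer: left-tail critical value, alpha = 0.01, df = 15

standard_error = math.sqrt(s_new**2 / n_new + s_std**2 / n_std)
t_stat = (mean_new - mean_std) / standard_error

# Left-tailed test: reject H0 only if the statistic falls below the critical value
reject_h0 = t_stat < t_crit

print(f"SE = {standard_error:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```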

300

Test whether the distribution of drink preferences is the same across age groups at alpha = 0.05.

       | Soda | Juice | Water
Teens  | 40   | 20    | 10
Adults | 30   | 25    | 25

χ²(0.05, 2) = 5.99           χ²(0.05, 1) = 4.61            χ²(0.025, 3) = 9.21

Expected Counts = row total*column total / grand total

TS = 32.67, TJ = 21, TW = 16.33, AS = 37.33, AJ = 24, AW = 18.67

Test stat = (40-32.67)^2/32.67 + (20-21)^2/21 + (10-16.33)^2/16.33 + (30-37.33)^2/37.33 + (25-24)^2/24 + (25-18.67)^2/18.67 = 7.77 

Critical Value = 5.99

Since our test statistic (7.77) is more extreme than the critical value (5.99), we reject Ho. There is evidence to suggest that the distribution of drink preferences differs by age group. 
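The expected counts and test statistic can be generated from the table directly. This sketch carries the expected counts at full precision, which gives a statistic of about 7.78 instead of the value obtained from counts rounded to two decimals; the conclusion is the same. The critical value 5.99 is taken from the answer.

```python
# Observed counts from the question (columns: Soda, Juice, Water)
observed = [
    [40, 20, 10],  # Teens
    [30, 25, 25],  # Adults
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom = (rows - 1) * (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 5.99  # from the answer: alpha = 0.05, df = 2
reject_h0 = chi_sq > critical_value

print(f"df = {df}, chi-square = {chi_sq:.2f}, reject H0: {reject_h0}")
```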

400

A random sample of 16 students yields an average study time of 12.5 hours per week with a sample standard deviation of 3.1 hours. Construct a 95% confidence interval for the population mean. 

t.inv(0.025, 15) = 2.13 

t.inv(0.05, 15) = 2.44 

t.inv(0.05, 16) = 1.13

n = 16 | x_bar = 12.5 | s = 3.1 | alpha = 0.05 | CV = t.inv(0.025, 15) = 2.13

Standard error = 3.1 / sqrt(16) = 0.775

CI: 12.5 +/- 2.13 * 0.775 = (10.85, 14.15)

We are 95% confident that the population mean study time is between 10.85 hours and 14.15 hours. 
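The interval can be verified with a short Python sketch. The critical value 2.13 is taken from the answer (df = 15) rather than computed, since the standard library has no t distribution.

```python
import math

# Values from the question
n = 16
x_bar = 12.5  # sample mean study time (hours per week)
s = 3.1       # sample standard deviation

t_crit = 2.13  # from the answer: two-sided 95% critical value with df = 15

standard_error = s / math.sqrt(n)
margin = t_crit * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(f"SE = {standard_error:.3f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```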

400

A researcher believes that 75% of individuals in a city support a certain policy. After surveying 250 people, 178 support the policy. Based on a comparison of the test statistic and the critical value (with alpha = 0.01), what conclusion would you make? 

Norm.inv(0.01, 0, 1) = 1.678

Norm.inv(0.005, 0, 1) = 2.576

Norm.inv(0.99, 0, 1) = 1.98

Two-sided CV: alpha / 2 = 0.005 --> = +/- norm.inv(0.005, 0, 1) = +/- 2.576

p0 = .75 | p_hat = 178 / 250 = 0.712 | standard error = sqrt[(0.75*0.25)/250] = 0.0274

Test stat = (0.712 – 0.75) / 0.0274 = -1.39

Since the test stat is not “beyond” the CV, we would fail to reject H0.
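The full comparison can be sketched in Python, with `NormalDist().inv_cdf` standing in for Excel's `norm.inv`. Rounding the standard error at an intermediate step shifts the statistic slightly, but the conclusion is the same.

```python
import math
from statistics import NormalDist

# Values from the question
p0 = 0.75        # claimed proportion under H0
n = 250
successes = 178
alpha = 0.01

p_hat = successes / n

# The standard error uses p0 because the null hypothesis is assumed true
standard_error = math.sqrt(p0 * (1 - p0) / n)
z_stat = (p_hat - p0) / standard_error

# Two-sided critical value (positive magnitude)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z_stat) > z_crit

print(f"z = {z_stat:.2f}, critical value = +/- {z_crit:.3f}, reject H0: {reject_h0}")
```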

400

A survey compares the proportion of people who prefer two different brands.

Brand X: 56 out of 100 people prefer it 

Brand Y: 72 out of 120 people prefer it 

At the 𝛼 = 0.05 level, test whether the proportions differ.

P1 = 0.56   P2 = 0.60

Test stat = (0.56 – 0.60) / sqrt(0.56*0.44/100 + 0.60*0.40/120) = -0.60

Critical value = +/-1.96

Since our test stat is less extreme than our critical value, we fail to reject Ho. 

400

Test whether the two variables are independent at alpha = 0.05.

            | Good | Poor
Exercise    | 45   | 15
No Exercise | 25   | 25

χ²(0.05, 1) = 3.84             χ²(0.025, 2) = 2.71              χ²(0.05, 4) = 6.63

Expected Counts: EG = 38.18, EP = 21.82, NoExG = 31.82, NoExP = 18.18

Test Stat = 7.37

Critical value = 3.84

Since the test stat (7.37) is more extreme than the critical value (3.84), we reject Ho. There is evidence that the two variables (exercise and sleep quality) are associated.
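The answer states the test statistic without showing the arithmetic. A small helper makes the calculation explicit; `chi_square_statistic` is a hypothetical name used only for this sketch, and the critical value 3.84 is taken from the answer.

```python
def chi_square_statistic(table):
    """Return (statistic, expected counts) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count = row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return statistic, expected


# Observed counts from the question (columns: Good, Poor)
observed = [
    [45, 15],  # Exercise
    [25, 25],  # No Exercise
]

chi_sq, expected = chi_square_statistic(observed)
critical_value = 3.84  # from the answer: alpha = 0.05, df = (2 - 1) * (2 - 1) = 1
reject_h0 = chi_sq > critical_value

print(f"chi-square = {chi_sq:.2f}, reject H0: {reject_h0}")
```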

500

From earlier polling, the CW believes that 34.8% of viewers watch Flash in their television lineup. On March 28th, 2016, a crossover episode of Flash with Supergirl aired on CBS reaching a different audience. The CW conducts a random survey of 350 viewers and finds that 137 viewers watch Flash. 

Calculate the test statistic. 

H_o: p = .348

H_a: p != .348

p_hat = 137 / 350 = .3914

Standard error (under H_o) = sqrt[(.348 * (1 - .348) / 350)] = .0255 

Test stat = (.3914 - .348) / .0255 = 1.702
