Probability
Experimental Design
Sampling Distributions
LSRL
100

60% of students take a math class after school. 40% of students play a sport. 25% of students do both. A student is selected at random. What is the probability the student takes a math class but does not play a sport?

P(M only) = P(M) - P(M and S)

0.6 - 0.25 = 0.35

100

Eason wants to test whether a new memory-training app improves students' AP Stats scores. He recruits 120 volunteers from an AP Stats course. Students are randomly assigned to either use the memory-training app for 4 weeks or continue studying normally.

a) Identify the experimental units

120 AP Stats students in the study

b) Identify the treatment and control

Treatment = memory-training app (4 weeks)

Control = continuing to study normally without app

100

The weights of apples in Phoebe's large orchard are normally distributed with mean = 150 grams and population SD = 20 grams. A random sample of 25 apples is selected.

a) What is the mean & SD of the sampling distribution of the sample mean? Mean of x-bar = 150 grams; SD of x-bar = 20/sqrt(25) = 4 grams.

100

A regression model predicting college GPA from hours studied per week is: y-hat = 1.85 + 0.12x

Interpret the slope.

Each additional hour studied per week is associated with a predicted increase of 0.12 points in college GPA.

200

At a university, 60% of students own a laptop, 35% own a tablet, and 25% own both. If a randomly selected student owns a laptop, what is the probability they also own a tablet?

P(T|L) = P(T and L) / P(L) = 0.25/0.6 = 5/12 or 0.4167
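
The conditional probability above can be checked with exact fractions. This is a minimal sketch; the variable names are illustrative and not part of the original problem.

```python
from fractions import Fraction

# Given values from the problem statement.
p_laptop = Fraction(60, 100)   # P(L)
p_both = Fraction(25, 100)     # P(T and L)

# Conditional probability: P(T | L) = P(T and L) / P(L)
p_tablet_given_laptop = p_both / p_laptop

print(p_tablet_given_laptop, round(float(p_tablet_given_laptop), 4))
```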

200

Aaron wants to test whether a new memory-training app improves students' AP Stats scores. He recruits 120 volunteers from an AP Stats course. Students are randomly assigned to either use the memory-training app for 4 weeks or continue studying normally.

Explain why random assignment is important to this experiment.

Random assignment tends to balance potential confounding variables, such as prior study habits and motivation, across the two groups. A difference in scores at the end can then be attributed to the app rather than to pre-existing differences between students.

200

We are working with a sampling distribution of p-hat. Given p = 0.55 and n = 200, check conditions for normal approximation.

Need: np >= 10 and n(1-p) >= 10

np = 200(0.55) = 110

n(1-p) = 200(0.45) = 90

Both are at least 10, so the normal approximation is appropriate.
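
The same check can be written as a short sketch. The threshold of 10 is the usual AP Stats large counts condition, and the variable names are illustrative.

```python
# Large counts check for the sampling distribution of p-hat.
p = 0.55
n = 200

expected_successes = n * p        # np
expected_failures = n * (1 - p)   # n(1 - p)

# Both expected counts should be at least 10.
large_counts_ok = expected_successes >= 10 and expected_failures >= 10

print(expected_successes, expected_failures, large_counts_ok)
```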


300

Martin answers 90% of his MCQ questions correctly. If Martin answers 6 MCQ questions, what is the probability he'll get 2 questions wrong?

P(X=2) = (6 choose 2)*(0.1)^2*(0.9)^4 = 0.0984
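
A small sketch of the binomial calculation, treating X as the number of wrong answers with n = 6 and p = 0.1. The helper name binomial_pmf is hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of exactly 2 wrong answers out of 6, with P(wrong) = 0.1.
p_two_wrong = binomial_pmf(2, 6, 0.1)

print(round(p_two_wrong, 4))
```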

300

Aaron wants to test whether a new memory-training app improves students' AP Stats scores. He recruits 120 volunteers from an AP Stats course. Students are randomly assigned to either use the memory-training app for 4 weeks or continue studying normally.

Explain one potential form of bias (specifically implied in the question).

Voluntary response bias: the students volunteered, so they may not represent all students in AP Stats courses.

300

The distribution of wait times at a clinic is right-skewed with mean = 40 minutes, SD = 12 minutes. A random sample of 64 patients is selected. 

a) Describe the shape of the sampling distribution of the sample mean. Approximately normal by the CLT, since n = 64 >= 30.

b) Find the probability the sample mean waiting time is less than 37 minutes. SD of x-bar = 12/sqrt(64) = 1.5. z = (37-40)/1.5 = -2. P(x-bar < 37) = P(z < -2) = 0.0228.
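
Part b) can be reproduced with Python's standard library. This sketch assumes the CLT approximation from part a) holds; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 40, 12, 64

# Standard deviation of the sample mean.
sd_xbar = sigma / sqrt(n)

# Standardize the cutoff of 37 minutes.
z = (37 - mu) / sd_xbar

# Left-tail probability under the standard normal curve.
prob_below_37 = NormalDist().cdf(z)

print(sd_xbar, z, round(prob_below_37, 4))
```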

300

A regression model relating temperature (x) to ice cream sales (y) is: y-hat = 12 + 3.5x. One point is discovered to be an outlier with large x-value but small y-value. 

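a) How will removing the outlier affect the slope of the LSRL? The slope will increase, because the point was pulling the right end of the line down.

b) How will it affect r? r will increase (move closer to 1), because the remaining points follow the linear pattern more closely.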
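
The effect can be illustrated with a small sketch. The data below are made up for illustration only (they are not from the problem), chosen to lie near y-hat = 12 + 3.5x, with one added point that has a large x-value and a small y-value. The helper slope_and_r is hypothetical.

```python
from statistics import mean, stdev

def slope_and_r(xs, ys):
    """Return (LSRL slope, correlation r) from paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    n = len(xs)
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    return r * s_y / s_x, r

# Hypothetical data: temperature (x) and ice cream sales (y).
temps = [60, 65, 70, 75, 80, 85]
sales = [222, 240, 256, 275, 291, 310]

# Hypothetical outlier: large x-value, small y-value.
temps_with = temps + [100]
sales_with = sales + [250]

slope_with, r_with = slope_and_r(temps_with, sales_with)
slope_without, r_without = slope_and_r(temps, sales)

print(slope_with, r_with)
print(slope_without, r_without)
```

With this made-up data, both the slope and r are larger once the outlier is removed, matching the answers above.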

400

A factory produces electronic chips using two machines. Machine A produces 70% of the chips. Machine B produces 30% of the chips. 2% of Machine A chips are defective, while 5% of Machine B chips are defective. A chip is selected at random. What's the probability it's defective?

P(D) = (0.02)(0.7) + (0.05)(0.3) = 0.014 + 0.015 = 0.029
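
A minimal sketch of the total probability calculation, using the 2% and 5% defect rates from the problem statement. The variable names are illustrative.

```python
# Share of chips produced by each machine.
p_a, p_b = 0.70, 0.30

# Defect rate for each machine.
p_def_given_a, p_def_given_b = 0.02, 0.05

# Law of total probability: P(D) = P(A)P(D|A) + P(B)P(D|B)
p_defective = p_a * p_def_given_a + p_b * p_def_given_b

print(round(p_defective, 3))
```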

400

Andy wants to research whether a new protein supplement increases muscle gain. He recruits 80 athletes, 40 male and 40 female. He plans to randomly assign athletes to either receive the supplement, or receive the placebo.

a) Explain why gender might be a confounding variable. Gender could be related to differences in muscle gain due to biological differences such as hormone levels or baseline muscle mass. If one group has more males than females, the effect of gender could get mixed up with the effect of the supplement.

b) Explain why blocking might reduce variability in the results. Since gender may affect muscle gain, blocking subjects based on gender makes the treatment comparison more precise and reduces unexplained variation.

400

In a large population, 55% of voters support a certain policy. A random sample of 200 voters is selected. Find the probability that more than 60% of the sample supports the policy.

z = (0.6-0.55)/sqrt(0.55*0.45/200) = 1.42

P(p-hat > 0.6) = P(z > 1.42) = 0.0778 or 7.78%
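
The calculation can be sketched in code, assuming the normal approximation is appropriate (np = 110 and n(1-p) = 90 are both at least 10). Because the code does not round z to 1.42 first, its result may differ slightly in the fourth decimal place from the table-based answer.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.55, 200

# Standard deviation of the sampling distribution of p-hat.
sd_phat = sqrt(p * (1 - p) / n)

# Standardize the cutoff of 0.60.
z = (0.60 - p) / sd_phat

# Right-tail probability under the standard normal curve.
prob_above_60 = 1 - NormalDist().cdf(z)

print(round(z, 2), round(prob_above_60, 4))
```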

500

Yin Ching is creating his own calculator factory to make sure he never runs out of calculators. Historically, 4% of calculators are defective. He randomly inspects 15 calculators from a batch of many calculators.

a) What is the probability exactly 2 calculators are defective? 0.0988

b) What is the probability at least 1 calculator is defective? 0.458

c) Find the mean and SD of the number of defective calculators in the sample. mean = np = 15(0.04) = 0.6, SD = sqrt(np(1-p)) = sqrt(15*0.04*0.96) = 0.76
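
All three parts follow from treating the number of defective calculators as binomial with n = 15 and p = 0.04. A minimal sketch, with illustrative variable names:

```python
from math import comb, sqrt

n, p = 15, 0.04

# a) Exactly 2 defective.
p_exactly_two = comb(n, 2) * p**2 * (1 - p)**(n - 2)

# b) At least 1 defective, via the complement of "none defective".
p_at_least_one = 1 - (1 - p)**n

# c) Mean and standard deviation of a binomial count.
mean_defective = n * p
sd_defective = sqrt(n * p * (1 - p))

print(round(p_exactly_two, 4), round(p_at_least_one, 3))
print(round(mean_defective, 2), round(sd_defective, 2))
```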

500

Tommy is testing a new sleep medication. In his study, 200 participants with insomnia are randomly assigned. Half receive the medication, half receive a placebo. Neither the participants nor the researchers interacting with them know which treatment is being used.

a) Identify the type of blinding used. Double-blind experiment.

b) Describe how a lack of blinding could bias the results. If participants know the treatment, they might report better sleep even if the drug had little impact. If researchers knew who received the medication, they might unintentionally influence participants.

500

Anson's company produces batteries. The lifetime of batteries is normally distributed with mean=500 hours and SD=80 hours. Two independent samples are taken: Sample A: n=25 batteries, Sample B: n=64 batteries. 

b) Which sample mean will have less variability, and why? SD of x-bar for A = 80/sqrt(25) = 16. SD of x-bar for B = 80/sqrt(64) = 10. B has less variability because its sample size is larger.

c) Explain how increasing sample size affects the sampling distribution of the mean. Since SD of x-bar = SD/sqrt(n), increasing the sample size n reduces variability, making x-bar a more precise estimator of mu.
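
The comparison can be sketched with a small helper. The function name sd_of_sample_mean is hypothetical.

```python
from math import sqrt

def sd_of_sample_mean(sigma, n):
    """Standard deviation of x-bar for samples of size n."""
    return sigma / sqrt(n)

sigma = 80  # population SD of battery lifetime, in hours

sd_a = sd_of_sample_mean(sigma, 25)  # Sample A
sd_b = sd_of_sample_mean(sigma, 64)  # Sample B

print(sd_a, sd_b)
```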

500

A data set has the following statistics:

x-bar = 12, y-bar = 25, sx = 4, sy = 10, r = 0.6.

Write the equation of the LSRL.

slope = r*(sy/sx) = 0.6(10/4) = 0.6(2.5) = 1.5

y-intercept = y-bar - slope*x-bar

25 - 1.5(12) = 7

y-hat = 7 + 1.5x
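
The same steps as a short sketch, taking 12 and 25 as the means of x and y and 4 and 10 as their standard deviations. The variable names are illustrative.

```python
# Summary statistics from the problem.
x_bar, y_bar = 12, 25
s_x, s_y = 4, 10
r = 0.6

# LSRL slope and intercept from summary statistics.
slope = r * s_y / s_x
intercept = y_bar - slope * x_bar

print(f"y-hat = {intercept:g} + {slope:g}x")
```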