Ch. 1-2
Ch. 3-4
Ch. 5-6
Ch. 7-8
Ch. 9
100

The science and art of collecting, analyzing, and drawing conclusions from data

Statistics

100

In a __________ experiment, either the subjects don't know which treatment they are receiving or the people who interact with them and measure the response variable don't know which subjects are receiving the treatment.

In a single-blind experiment, either the subjects don't know which treatment they are receiving or the people who interact with them and measure the response variable don't know which subjects are receiving the treatment.

100

The long-run relative frequency of an outcome after many repetitions of a chance process is its ___________

The long-run relative frequency of an outcome after many repetitions of a chance process is its probability.

100

When constructing confidence intervals, the _________ is a statistic that provides an estimate of a population parameter.

When constructing confidence intervals, the point estimator is a statistic that provides an estimate of a population parameter.

100

_______ _______ _______ ________: following these 4 steps makes inference much faster.

State, Plan, Do, Conclude: following these 4 steps makes inference much faster.


200

A ___________ assigns labels that place each individual into a particular group, called a category. A ___________ takes number values that are quantities: counts or measurements.

A categorical variable assigns labels that place each individual into a particular group, called a category. A quantitative variable takes number values that are quantities: counts or measurements.

200

In an experiment, a _________ is used to provide a baseline for comparing the effects of other treatments.

In an experiment, a control group is used to provide a baseline for comparing the effects of other treatments.

200

What is the addition rule for mutually exclusive events?

P(A or B) = P(A) + P(B)

200

A ________ is a multiplier that makes the interval wide enough to have the stated capture rate.

A critical value is a multiplier that makes the interval wide enough to have the stated capture rate.

200

"Because the p-value of 0.007 is less than the significance level of 0.05, we have convincing evidence that the true proportion of students from Basis who get a 5 on their AP test is greater than 0.8."

Assume the above statement is correct. Explain why it's incomplete.

It doesn't explicitly state the decision: that we reject the null hypothesis.

300

The _______ of a variable tells us what values the variable takes and how often it takes those values.

The distribution of a variable tells us what values the variable takes and how often it takes those values.

300

Explain the concept of confounding and how it limits the ability to make cause-and-effect conclusions.

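A confounding variable is a hidden or unaccounted-for variable that influences both the explanatory and response variables.

Because it influences both, its effect on the response cannot be separated from the effect of the explanatory variable, so an observed association cannot be attributed to the explanatory variable alone.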

300

What is the general addition rule?

P(A or B) = P(A) + P(B) - P(A and B)
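
The general addition rule, P(A or B) = P(A) + P(B) - P(A and B), can be checked by counting outcomes. The sketch below assumes a hypothetical example that is not part of the review: one roll of a fair six-sided die, with events A and B chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # event A: the roll is even
B = {4, 5, 6}  # event B: the roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

# Count the union directly, then compute it with the general addition rule.
p_union_direct = prob(A | B)
p_union_rule = prob(A) + prob(B) - prob(A & B)

print(p_union_direct, p_union_rule)  # both are 2/3
```

Subtracting P(A and B) removes the outcomes 4 and 6, which would otherwise be counted twice. When A and B are mutually exclusive, P(A and B) = 0 and the rule reduces to P(A) + P(B).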

300

If we want to reduce our margin of error, we must _________________ or _______________

If we want to reduce our margin of error, we must decrease confidence level or increase sample size.
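
A minimal sketch of both options, assuming a one-sample z interval for a proportion. The values of p-hat, n, and the confidence levels are hypothetical and chosen only to show the direction of the change.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence):
    """Margin of error for a one-sample z interval for a proportion."""
    # Critical value z* for the requested confidence level.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical baseline: p-hat = 0.5, n = 100, 95% confidence.
baseline = margin_of_error(0.5, 100, 0.95)
lower_confidence = margin_of_error(0.5, 100, 0.90)  # decrease confidence level
larger_sample = margin_of_error(0.5, 400, 0.95)     # increase sample size

print(round(baseline, 3), round(lower_confidence, 3), round(larger_sample, 3))
```

Because n sits under a square root, quadrupling the sample size only cuts the margin of error in half.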

300

What is the difference between calculation of the Standard Error for confidence intervals for proportions and significance tests for proportions?

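CI: use p-hat (the sample proportion) in the standard error, because the true proportion is unknown.

Significance test: use the null value of p, because the calculation assumes the null hypothesis is true.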
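
The two calculations use the same formula with a different proportion plugged in. This sketch assumes hypothetical numbers (170 successes in a sample of 200, null value 0.8); the function names are made up for illustration.

```python
from math import sqrt

def se_for_interval(p_hat, n):
    """Standard error for a confidence interval: built from the sample proportion."""
    return sqrt(p_hat * (1 - p_hat) / n)

def sd_for_test(p_null, n):
    """Standard deviation for a significance test: built from the null value."""
    return sqrt(p_null * (1 - p_null) / n)

# Hypothetical data: 170 of 200 sampled students, null hypothesis p = 0.8.
n = 200
p_hat = 170 / n
p_null = 0.8

print(round(se_for_interval(p_hat, n), 4))  # uses p-hat = 0.85
print(round(sd_for_test(p_null, n), 4))     # uses null p = 0.8
```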

400

If knowing the value of one variable helps us predict the value of another variable, we say there is ________ between the two variables.

If knowing the value of one variable helps us predict the value of another variable, we say there is (an) association between the two variables.

400

Florence felt sad there was no Statistics homework on Thanksgiving, so she decided to collect data on the number of hours students sleep the night before an exam and their scores on that exam. Her data resulted in a LSRL with a slope of 3.8 and y-intercept of 52.4. Interpret the slope and y-intercept in context.

slope: For each additional hour of sleep, the model predicts the exam score will increase by 3.8 points on average.

y-intercept: A student who sleeps 0 hours the night before the exam is predicted to get a score of 52.4.
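
Both interpretations can be read off the model directly. The sketch below uses the slope and y-intercept from the question; the function name and the sample inputs (0, 7, and 8 hours) are hypothetical.

```python
SLOPE = 3.8       # predicted change in score per additional hour of sleep
INTERCEPT = 52.4  # predicted score at 0 hours of sleep

def predicted_score(hours_of_sleep):
    """Predicted exam score from Florence's least-squares regression line."""
    return INTERCEPT + SLOPE * hours_of_sleep

# y-intercept: the prediction when the explanatory variable is 0.
print(predicted_score(0))

# slope: the difference between predictions one hour of sleep apart.
print(round(predicted_score(8) - predicted_score(7), 1))
```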

400

What conditions must be met before we treat something as a binomial random variable?

Binary (2 outcomes)

Independent (trial outcomes don't affect each other)

Number of trials fixed

Same probability each trial

400

Phoebe noticed one of her classmates practicing interpreting confidence intervals. Her classmate constructed a 95% confidence interval for the mean time (in minutes) it takes students to complete a puzzle, which was calculated to be (12.4, 15.8). They interpreted this as "There is a 95% chance that the true mean time is between 12.4 and 15.8 minutes". Phoebe warned them that this is incorrect. Why?

The true population mean is a fixed value: it is either in the interval or it is not. The 95% describes the method used to generate the interval, not the parameter: in the long run, about 95% of intervals built this way capture the true mean.
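
A small simulation can illustrate what the 95% refers to. This is a sketch under stated assumptions, none of which come from the puzzle study: a Normal population with a made-up true mean of 14 and a known standard deviation of 4, samples of size 25, and a z interval for simplicity.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def coverage_rate(true_mean=14.0, sigma=4.0, n=25, trials=2000, seed=1):
    """Fraction of simulated 95% z intervals that capture the fixed true mean."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(0.975)
    margin = z_star * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = mean(sample)
        # The true mean never changes; only the interval moves from sample to sample.
        if x_bar - margin <= true_mean <= x_bar + margin:
            hits += 1
    return hits / trials

print(coverage_rate())  # should land close to 0.95
```

Each individual interval either contains the true mean or it does not; the 95% shows up only as the long-run capture rate of the method.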

400

Before going home, Diego gives the following warning to all his classmates: "Beware of multiple analyses!"

Why did he do this?

Performing many significance tests on the same data raises the chance that at least one of them gives a false positive (a Type I error) by random chance alone.
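
A rough sketch of how quickly that chance grows, assuming the tests are independent and every null hypothesis is actually true (both are simplifying assumptions made here for illustration):

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """P(at least one Type I error) across independent tests when every null is true."""
    return 1 - (1 - alpha) ** num_tests

for k in (1, 5, 20):
    print(k, round(chance_of_false_positive(k), 3))
# 1 0.05
# 5 0.226
# 20 0.642
```
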

500

During the holiday, Tommy became very curious about the study habits of his classmates. He surveyed 150 students in the school about whether they preferred studying alone or with others. He also asked them if they consider themselves introverted or extroverted.

               | Study Alone | Study with Others | Total
Introverted    |     54      |        21         |   75
Extroverted    |     18      |        57         |   75
Total          |     72      |        78         |  150


Is there an association between personality type and study preference? Justify your answer.

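Yes, there is a strong association. Support your answer by comparing any conditional probabilities: for example, 54/75 = 72% of introverted students prefer studying alone, but only 18/75 = 24% of extroverted students do.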
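
The conditional proportions can be computed from the counts in Tommy's table. A minimal sketch; the dictionary layout and function name are illustrative choices.

```python
# Counts from Tommy's two-way table.
counts = {
    "Introverted": {"Study Alone": 54, "Study with Others": 21},
    "Extroverted": {"Study Alone": 18, "Study with Others": 57},
}

def proportion_alone(personality):
    """P(prefers studying alone | personality type)."""
    row = counts[personality]
    return row["Study Alone"] / sum(row.values())

for personality in counts:
    print(personality, round(proportion_alone(personality), 2))
# Introverted 0.72
# Extroverted 0.24
```

If there were no association, the two conditional proportions would be about equal.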

500

Andy felt determined to practice his Statistics skills outside of class by modeling the relationship between the mass of an animal (kg) and its metabolic rate (watts). His model predicted the metabolic rate from mass, with slope of 6.45, y-intercept of 18.2, s of 12.7, and r² of 0.82.

a) write the equation of the LSRL for Andy's model

b) interpret r²

a) write the equation of the LSRL

y-hat = 18.2 + 6.45x, where y-hat = predicted metabolic rate and x = mass (in kg)

b) interpret r²

About 82% of the variation in metabolic rate among these animals is explained by the linear relationship with mass.

500

Jerry is using his knowledge of statistics and business to start a light bulb company. 40% of his light bulbs last more than 1,000 hours. His company is new, so they've only produced 200 light bulbs. Yinching wants to take a sample of 30 randomly selected light bulbs and model the number of long-lasting bulbs in the sample as a binomial random variable. But Jerry warns Yinching that this is not a good use of statistics. Why?

Yinching would be sampling without replacement, and n = 30 is more than 10% of the population (0.10 × 200 = 20). The 10% condition fails, so the Independent condition from BINS is violated.
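
A sketch of the check, plus a comparison against the exact model for sampling without replacement (the hypergeometric distribution, which is brought in here only for comparison). It assumes exactly 80 of the 200 bulbs (40%) are long-lasting.

```python
from math import comb, sqrt

POPULATION = 200
SAMPLE = 30
LONG_LASTING = 80  # assumption: exactly 40% of the 200 bulbs

# 10% condition: the sample should be at most 10% of the population.
ten_percent_ok = SAMPLE <= 0.10 * POPULATION
print(ten_percent_ok)  # False

def binomial_pmf(k):
    """Binomial model: treats the 30 draws as independent with p = 0.4."""
    return comb(SAMPLE, k) * 0.4**k * 0.6 ** (SAMPLE - k)

def hypergeometric_pmf(k):
    """Exact model for drawing 30 bulbs without replacement from 200."""
    return (comb(LONG_LASTING, k) * comb(POPULATION - LONG_LASTING, SAMPLE - k)
            / comb(POPULATION, SAMPLE))

def standard_deviation(pmf):
    """Standard deviation of a count distribution on 0..SAMPLE."""
    values = range(SAMPLE + 1)
    mu = sum(k * pmf(k) for k in values)
    return sqrt(sum((k - mu) ** 2 * pmf(k) for k in values))

print(round(standard_deviation(binomial_pmf), 2))
print(round(standard_deviation(hypergeometric_pmf), 2))
```

Under these assumptions the binomial model overstates the variability of the count, which is one concrete way the failed condition shows up.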

500

Anson decides to poll students at Basis to see how many believe AP Statistics should be a required class for graduating. He wants to create a 95% confidence interval with a margin of error no bigger than 4%. What is the minimum sample size Anson needs to achieve his goal?

0.04 = 1.96*sqrt([0.5*0.5] / n)

0.0016 = 3.8416(0.25/n)

n = (3.8416 * 0.25) / 0.0016

n = 600.25 -> round up -> n = 601 students
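
The same calculation as a short sketch. The function name and parameters are hypothetical; it uses the conservative guess p-hat = 0.5 from the work above and a more precise z* than 1.96, which gives the same answer after rounding up.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(margin, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error for a proportion is at most `margin`."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Solve margin >= z* * sqrt(p(1-p)/n) for n, then round up.
    return ceil(z_star**2 * p_guess * (1 - p_guess) / margin**2)

print(min_sample_size(0.04))  # 601
```

Always round up: rounding 600.25 down to 600 would leave the margin of error slightly above 4%.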

500

Karina is creating a company designed to monitor students' biometric signs so that their teachers always know exactly how much time they spent studying.

Before the company launches, her null hypothesis is that mean time (mins) studying Statistics per night equals 60, and the alternative hypothesis is that the mean time is less than 60. She chooses a significance level of 0.05. 

a) Describe a Type I error in context

b) Describe a Type II error in context

a) Describe a Type I error in context

Finding convincing evidence that mean study time for Stats is less than 60 mins/night when it's actually 60 mins.

b) Describe a Type II error in context

Failing to find convincing evidence that mean study time for Stats is less than 60 mins/night when it actually is less than 60 mins.
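
Both error types can be illustrated with a simulation. This is only a sketch under assumptions that are not part of Karina's problem: study times are Normal with a known standard deviation of 15 minutes, samples have 40 students, the test is a one-sided z test, and the "actually less than 60" scenario uses a true mean of 55.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

MU_0 = 60.0    # null hypothesis: mean study time is 60 minutes
ALPHA = 0.05   # significance level from the problem
SIGMA = 15.0   # hypothetical known population standard deviation
N = 40         # hypothetical sample size

def rejection_rate(true_mean, trials=2000, seed=7):
    """Fraction of simulated samples in which H0 is rejected in favor of mean < 60."""
    rng = random.Random(seed)
    z_cutoff = NormalDist().inv_cdf(ALPHA)  # about -1.645 for a lower-tail test
    rejections = 0
    for _ in range(trials):
        x_bar = mean([rng.gauss(true_mean, SIGMA) for _ in range(N)])
        z = (x_bar - MU_0) / (SIGMA / sqrt(N))
        if z < z_cutoff:
            rejections += 1
    return rejections / trials

# Type I error: rejecting H0 when the true mean really is 60.
type_1_rate = rejection_rate(true_mean=60.0)
# Type II error: failing to reject H0 when the true mean is really 55.
type_2_rate = 1 - rejection_rate(true_mean=55.0)

print(type_1_rate, type_2_rate)
```

The Type I error rate should sit near the significance level of 0.05, while the Type II error rate depends on how far the true mean is from 60, on the sample size, and on the variability.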
