Graphs
Sampling
Experimental Design
Measures of Center
Cause and Effect
100

Which graph better represents data and why?

What is Graph A?  Graph B distorts the proportions, exaggerating the differences in the data.

100

In a large lecture class of 300 students, a sample of 10 was taken to determine the male/female makeup of the class.  Which misuse of statistics does this represent?

A.  Percentage.

B.  Precise numbers.

C.  Missing data.

D.  Small samples.

What is a small sample?

100

Casualty data from the great flu epidemic of 1918 were collected for a study.  This represents what type of study?

A.  Cross-sectional.

B.  Retrospective.

C.  Prospective.

D.  Qualitative.

What is retrospective?

100

A sample value that lies very far away from the majority of the other sample values is

A.  The center.

B.  A distribution.

C.  An outlier.

D.  A variance.

What is an outlier?

100

Sally concludes from her experiment that going to the gym and doing well on her exam are positively associated. She determines that if she goes to the gym the night before an exam, she will automatically do well and does not need to study for it. Why is this not the correct interpretation of her experiment?

What is "association does not imply causation"?

200

Which graph represents the data better and why?

What is Graph A, because the values start at 0?

200

At a security checkpoint to a government facility, every 10th individual was more thoroughly searched than the others.  What type of sampling is this?

A.  Systematic.

B.  Convenience.

C.  Stratified.

D.  Cluster.

What is systematic?
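As a rough illustration of systematic sampling, the Python sketch below picks every 10th person from a hypothetical queue of 50 visitors. The names `systematic_sample` and `visitors` are made up for this example, and starting with the 10th person is an assumption; in practice the starting point is often chosen at random.

```python
def systematic_sample(population, k, start=None):
    """Return every k-th item, beginning with the item at index `start`."""
    if start is None:
        start = k - 1  # assumption: begin with the k-th person in line
    return population[start::k]


# Hypothetical queue of 50 visitors, numbered 1 to 50
visitors = list(range(1, 51))
searched = systematic_sample(visitors, 10)
print(searched)  # [10, 20, 30, 40, 50]
```

Once the starting point is fixed, everyone selected is determined by position in line, which is what separates this design from a simple random sample.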

200

Explanatory variables are called ______. The values of a factor are called ______.

What are factors and levels?

200

Which measure of center is the only one that can be used with data at the nominal level of measurement?

A.  Mean

B.  Median

C.  Mode

What is mode?

200

Choose the error in the stated conclusion: 

Given:  There is a significant linear correlation between the number of homicides in a town and the number of movie theaters in a town.

Conclusion:  Building more movie theaters will cause the homicide rate to rise.

A.  Correlation implies causality

B.  Data based on averages

C.  Property of linearity

What is "correlation implies causality"?

300

This pie chart of the percentage of high-school students who engage in specified dangerous behaviors has a problem.  What is it?

What is "the percentages do not add up to 100%"?

300

The United Nations Office on Drugs and Crime reports statistics on the top 5 nations in the world ranked by number of cars stolen in 2000 (bar graph below). What parameters should be considered for a more accurate analysis of the situation?

What are the overall population of each country and the per capita car thefts?

300

David conducted an experiment where he watched how many students used the crosswalk and how many students crossed elsewhere. Is this an example of an observational study or a designed experiment?

What is an observational study? 

300

Which of the following measures of center is not affected by outliers?

A. Mean

B. Median

C. Mode

What is median?

300

Anna wanted to know the proportion of UVA students who liked coffee. She wanted to have each grade be equally represented. Therefore, she took a random sample of 100 students from each grade level. Is this an example of block design and if not, why and what is this an example of?

This is not an example of a block design because a block design is a type of experimental design, not a sampling method. Instead, this is an example of stratified sampling because Anna takes a random sample within each stratum (grade level).
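A minimal Python sketch of stratified sampling, under assumed numbers: the four grade labels, the 500 student IDs per grade, and the helper name `stratified_sample` are all hypothetical. Only the idea of drawing 100 students at random from each grade comes from the question.

```python
import random


def stratified_sample(strata, n_per_stratum, seed=0):
    """Draw a simple random sample of the same size from every stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }


# Hypothetical population: 500 student IDs in each of four grade levels
grades = ["first-year", "second-year", "third-year", "fourth-year"]
strata = {grade: [f"{grade}-{i}" for i in range(500)] for grade in grades}

sample = stratified_sample(strata, 100)
print({grade: len(chosen) for grade, chosen in sample.items()})
```

Every grade contributes exactly 100 students, so no grade can be over- or under-represented by chance.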

400

The following graph compares people on welfare vs people with full-time jobs.  Why is the graph misleading?

What is a data collection flaw?

Are there really more people on welfare than those who have full time jobs? As Media Matters points out:

“Fox’s 108.6 million figure for the number of “people on welfare” comes from a Census Bureau’s account…of participation in means-tested programs, which include “anyone residing in a household in which one or more people received benefits” in the fourth quarter of 2011, thus including individuals who did not themselves receive government benefits. On the other hand, the “people with a full time job” figure Fox used included only individuals who worked, not individuals residing in a household where at least one person works.”

In other words, if you live with your Mom, Dad, brother Joe and cousin Sam, and Sam was (briefly) on some kind of welfare program, that counted against you and everyone in your household.

400

What is extrapolation and why is it not reasonable to use?

Extrapolation is the use of a regression line for predicting values when the explanatory variable lies far outside the range of the data that was used to determine the line. 

Additionally: Predictions far outside the range of the data are unreliable because it cannot be determined whether the relationship stays linear that far out.
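To make the risk concrete, here is a small Python sketch under invented assumptions: the true relationship is y = x², but it is observed only for x from 1 to 5. The helper `fit_line` and the data are hypothetical. The least-squares line does reasonably well inside the observed range and badly far outside it.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: the true curve is y = x**2, seen only on x = 1..5
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]
slope, intercept = fit_line(xs, ys)


def predict(x):
    return intercept + slope * x


inside_error = abs(predict(4) - 4 ** 2)     # x = 4 is inside the data range
outside_error = abs(predict(20) - 20 ** 2)  # x = 20 is far outside it
print(inside_error, outside_error)
```

The fitted line is y = -7 + 6x, which misses by 1 at x = 4 but by 287 at x = 20, because the line cannot know that the curve keeps bending.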

400

Sally was curious about the average number of hours all of the students in Microbiology were spending on homework for the class each week. The professor teaches five sections of Microbiology. One morning, Sally waits outside the classroom and asks the first 40 students who walk into the 9am section how many hours, on average, they spend on homework for the class. What sampling design did Sally use in this experiment and does it include any bias?

What is convenience sampling?

Explanation: Convenience sampling causes undercoverage bias here because Sally samples only one of the five sections, and only the first students to arrive, so she misses students in the other sections, students who arrive late, and students who do not attend.

400

If Bill Gates walked into the room right now, how would it affect the mean income?

What is "it would increase dramatically"?
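A quick Python sketch of the effect, using made-up incomes (both the five incomes and the one-billion figure are hypothetical). It also shows why the median, unlike the mean, barely moves when one extreme value joins the data.

```python
from statistics import mean, median

# Hypothetical annual incomes of five people in the room
incomes = [40_000, 45_000, 50_000, 55_000, 60_000]

# Hypothetical income for one extremely wealthy newcomer
with_newcomer = incomes + [1_000_000_000]

print("before:", mean(incomes), median(incomes))
print("after: ", mean(with_newcomer), median(with_newcomer))
```

With these numbers the mean jumps from 50,000 to more than 166 million, while the median moves only from 50,000 to 52,500.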

400

Jonathan wants to know what proportion of Calculus students thought the exam was hard. There are four calculus classes and Jonathan collected data from everyone in his calculus class. What is the sampling frame in this scenario and why?

The sampling frame in this scenario is the students in Jonathan’s calculus class.

Explanation: This is because only the students in his class have a chance of being chosen to be in the sample. The sampling frame is not all Calculus students because Jonathan only gathered data from students in his class and there are four different classes.   

500

The following graph describes the cost of attending a 4-year college vs the earnings after the graduation.  Why is this graph misleading?

What is comparing the total cost of tuition with annual income?

500

How do you interpret the slope and intercept of a linear regression line?

  • The slope is the amount of change in the predicted response when the explanatory variable increases by 1 unit.

  • The intercept is the predicted response when the explanatory variable is 0.
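Both interpretations can be checked numerically. The Python sketch below assumes a hypothetical fitted line, predicted exam score = 52 + 4.5 × (hours studied); the coefficients and the variable names are invented for illustration.

```python
# Hypothetical fitted line: predicted score = 52 + 4.5 * hours studied
intercept = 52.0
slope = 4.5


def predict(hours):
    """Predicted response for a given value of the explanatory variable."""
    return intercept + slope * hours


# Intercept: the predicted response when the explanatory variable is 0
print(predict(0))               # 52.0

# Slope: the change in the predicted response per 1-unit increase
print(predict(3) - predict(2))  # 4.5
```

The difference between predictions one unit apart equals the slope no matter which starting value is chosen.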



500

Jim ran an experiment about the proportion of people who prefer chocolate ice cream over vanilla. To gather data, Jim set up a table on his college campus and people came up and took the survey. Is there any bias present that could prevent Jim from getting a representative sample?

What is voluntary response sampling?

The students had to choose to participate in the survey; therefore, Jim is missing everyone who did not stop at the table. This is an example of undercoverage.




500

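Diagnostic test: P(D) is the probability that the patient has the disease, and P(P) is the probability that the patient tests positive. What are the conditional probabilities for a false positive test and a false negative test?

  • False negative: P(P^c | D), the probability of a negative test given that the patient has the disease.

  • False positive: P(P | D^c), the probability of a positive test given that the patient does not have the disease.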
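As a sketch, these two conditional probabilities can be estimated from a table of test results. The counts below are hypothetical (1,000 patients, 100 of whom have the disease) and are not taken from any real test.

```python
# Hypothetical counts from 1,000 patients
true_pos, false_neg = 90, 10    # the 100 patients who have the disease
false_pos, true_neg = 45, 855   # the 900 patients who do not

# False negative: negative test given disease, P(P^c | D)
false_negative_rate = false_neg / (true_pos + false_neg)

# False positive: positive test given no disease, P(P | D^c)
false_positive_rate = false_pos / (false_pos + true_neg)

print(false_negative_rate)  # 0.1
print(false_positive_rate)  # 0.05
```

Each rate is conditioned on a different group: the false negative rate divides by the patients with the disease, and the false positive rate divides by the patients without it.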

500

Kate flipped a coin and got heads the first time and heads again the second time. What is the probability that she would get heads on her third try?

The probability of her getting heads on her third try is 50% because the events are independent; therefore, the previous outcomes have no effect on her next trial.
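The same answer can be checked by listing every outcome. This Python sketch assumes a fair coin, so all 8 sequences of three flips are equally likely, and then looks only at the sequences that start with two heads.

```python
from fractions import Fraction
from itertools import product

# All 8 equally likely sequences of three flips of a fair coin
outcomes = list(product("HT", repeat=3))

# Keep only the sequences where the first two flips are heads
first_two_heads = [o for o in outcomes if o[0] == "H" and o[1] == "H"]

# Among those, count the sequences where the third flip is also heads
third_heads_too = [o for o in first_two_heads if o[2] == "H"]

p = Fraction(len(third_heads_too), len(first_two_heads))
print(p)  # 1/2
```

Two sequences start with two heads (HHH and HHT), and exactly one of them ends in heads, so the conditional probability is 1/2.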
