4.1a
4.1b
4.2a
4.2b
4.3a
4.3b
100

For the following situations involving sampling, identify—as precisely as possible—the population that the sample represents.

A business school researcher wants to know what factors affect the survival and success of small businesses. She selects a sample of 150 eating-and-drinking establishments from those listed in the telephone directory for a large city.

Population is all eating-and-drinking establishments listed in the telephone directory for that city.

100

For the following situations involving sampling, identify—as precisely as possible—the population that the sample represents.

An insurance company wants to monitor the quality of its procedures for handling loss claims from its auto insurance policyholders. Each month the company selects an SRS of all auto insurance claims filed that month to examine them for accuracy and promptness.

Population is all auto insurance claims filed with this company in a given month.

100

A medical study of heart surgery investigates the effect of a drug called a beta-blocker on the pulse rate of the patient during surgery. The pulse rate will be measured at a specific point during the operation. The investigators will use 20 patients facing heart surgery as subjects. You have a list of these patients, numbered 1 to 20, in alphabetical order.

Describe the design of a completely randomized, controlled experiment to test the effect of beta-blockers on pulse rate during surgery.

Use a random number table to choose ten 2-digit numbers from 01 to 20, ignoring repeats and numbers outside that range. The patients with these numbers will receive the beta-blocker during their operation. The remaining 10 patients will act as a control group and will not receive the beta-blocker. Measure the pulse rate of all patients at the specified point in the operation, and compare the mean pulse rates of the two groups.
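
The same randomization can be done in software instead of with a random number table. The following is a minimal sketch in Python, assuming the patients are labeled 1 to 20 as in the problem. The function name `assign_groups` and the seed value are illustrative choices, not part of the original exercise.

```python
import random

def assign_groups(n_subjects=20, n_treatment=10, seed=None):
    """Randomly split labels 1..n_subjects into a treatment and a control group."""
    rng = random.Random(seed)
    labels = list(range(1, n_subjects + 1))
    # sample() draws without replacement, so no label can be chosen twice
    treatment = sorted(rng.sample(labels, n_treatment))
    control = sorted(set(labels) - set(treatment))
    return treatment, control

# The seed is arbitrary; it only makes the illustration reproducible.
treatment, control = assign_groups(seed=1)
print("Beta-blocker group:", treatment)
print("Control group:", control)
```

Every patient ends up in exactly one group, and each set of 10 patients is equally likely to be the treatment group, which is what makes the design completely randomized.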

100

Agricultural scientists for a chemical company want to determine if a newly developed fertilizer produces heavier tomatoes than the fertilizer they currently manufacture. For their first pilot study, they have 24 healthy young tomato plants growing in individual pots, numbered from 1 to 24.

Describe the design of a completely randomized, controlled experiment to test whether the new fertilizer produces heavier tomatoes.

Use a random number table to choose twelve 2-digit numbers from 01 to 24, ignoring repeats. The tomato plants with these numbers will receive the new fertilizer. The remaining 12 plants will act as a control group and will receive the old fertilizer. Measure the total weight of tomatoes produced by each plant, and compare the mean weight in the two groups.

100

For the study described below, comment on the extent to which inferences can be drawn about a larger population and whether cause and effect can be established.

A football coach thinks lessons in yoga will improve the flexibility of his players and thereby reduce injuries. To test his theory, he randomly divides the players on the team into two groups. One group has 45 minutes of yoga training each day. The players in the other group do the standard stretching routine the team has used in the past. He compares flexibility in the two groups at the end of the experiment.

Random assignment-->cause and effect can be inferred. No random sampling-->Cannot generalize beyond the subjects of the study.

100

For the study described below, comment on the extent to which inferences can be drawn about a larger population and whether cause and effect can be established.

A teacher wants to decide if handing out a topic outline before the final exam improves the exam scores of calculus students. He has two sections of calculus this term. He gives one class a topic outline and tells the other class to generate their own topic outline. He then compares the two sections’ scores on the final exam.

No random assignment-->cause and effect cannot be inferred. No random sampling-->Cannot generalize beyond the subjects of the study.

200

A local radio talk-show host asks listeners to call in and vote for or against a proposed plan to raise the prices charged by municipal parking meters in a downtown shopping district. 75% of the respondents are opposed to the increase. Describe one possible source of error or bias that might arise in this poll and indicate the direction in which the estimate might be biased. What is the name for this kind of bias?

Only those listeners with strong opinions are likely to call in. The poll probably overestimates opposition to the increase. This is bias arising from voluntary response.

200

In late 1995, a Gallup survey reported that about 46% of Americans approved of sending troops to Bosnia. The poll did not mention that 20,000 U.S. troops were committed to go. A CBS News poll mentioned the 20,000 figure and got a different outcome: an approval rate of only 33%. Briefly explain why the mention of the number of troops would cause such a big difference in the poll results. Write the name of the kind of bias that is at work here.

This is bias arising from the wording of a question. Knowledge of how many troops were going to be deployed increased people’s concerns about troop safety.

200

A medical study of heart surgery investigates the effect of a drug called a beta-blocker on the pulse rate of the patient during surgery. The pulse rate will be measured at a specific point during the operation. The investigators will use 20 patients facing heart surgery as subjects. You have a list of these patients, numbered 1 to 20, in alphabetical order.

Use the section from the random digits table below to carry out the randomization required by your design and list the outcome of the randomization.

96746 12149 37823 71868 18442 35119 62103 39244

96927 19931 36809 74192 77567 88741 48409 41903

43909 99477 25330 64359 40085 16925 85117 36071

15689 14227 06565 14374 13352 49367 81982 87209

36759 58984 68288 22913 18638 54303 00795 08727

The first 10 patient numbers are 18, 19, 10, 03, 06, 08, 11, 15, 13, 09. These patients will constitute the treatment group. The remaining 10 patients will be in the control group.
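
The table-reading procedure can be checked mechanically. The sketch below, in Python, reads the digits above in 2-digit groups from left to right, skips groups outside 01 to 20 and repeats, and stops after 10 labels. The helper name `read_labels` is a hypothetical choice made for this illustration.

```python
def read_labels(table_rows, width, low, high, count):
    """Read fixed-width digit groups left to right and keep the first
    `count` distinct labels that fall between low and high."""
    # Drop the spaces that separate the 5-digit blocks, then join the rows
    digits = "".join("".join(row.split()) for row in table_rows)
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        if low <= label <= high and label not in chosen:
            chosen.append(label)
            if len(chosen) == count:
                break
    return chosen

rows = [
    "96746 12149 37823 71868 18442 35119 62103 39244",
    "96927 19931 36809 74192 77567 88741 48409 41903",
    "43909 99477 25330 64359 40085 16925 85117 36071",
    "15689 14227 06565 14374 13352 49367 81982 87209",
    "36759 58984 68288 22913 18638 54303 00795 08727",
]
treatment = read_labels(rows, width=2, low=1, high=20, count=10)
print(treatment)  # [18, 19, 10, 3, 6, 8, 11, 15, 13, 9]
```

The tenth label is found in the fourth row, so the fifth row of digits is never needed.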

200

Agricultural scientists for a chemical company want to determine if a newly developed fertilizer produces heavier tomatoes than the fertilizer they currently manufacture. For their first pilot study, they have 24 healthy young tomato plants growing in individual pots, numbered from 1 to 24.

Use the section from the random digits table below to carry out the randomization required by your design and list the outcome of the randomization.

27816 78416 18329 21337 35213 37741 04312 68508

66925 55658 39100 78458 11206 19876 87151 31260

08421 44753 77377 28744 75592 08563 79140 92454

53645 66812 61421 47836 12609 15373 98481 14592

66831 68908 40772 21558 47781 33586 79177 06928

The first 12 plant numbers are 16, 18, 13, 21, 04, 08, 10, 07, 11, 20, 15, 12. These tomato plants will constitute the treatment group. The remaining 12 plants will be in the control group.

200

For the study described below, comment on the extent to which inferences can be drawn about a larger population and whether cause and effect can be established.

Does lack of sleep affect your academic performance? A student explores this question by asking everyone in his statistics class to write down on a piece of paper his or her score on a recent test and total number of hours of sleep he or she got on the last three nights before taking the test.

No random assignment-->cause and effect cannot be inferred. No random sampling-->Cannot generalize beyond the subjects of the study.

200

For the study described below, comment on the extent to which inferences can be drawn about a larger population and whether cause and effect can be established.

Does blood type determine your personality? In a study aimed at answering this question, a random sample of 100 adults were given a personality test, and a comparison was made between the scores on the introversion/extroversion scale and blood type (A, B, AB, O).

No random assignment-->cause and effect cannot be inferred. Random sampling-->Can generalize to population from which the random sample was taken.

300

Two different organizations conduct polls in a city whose mayor has been accused of taking bribes. One poll asks an SRS of city residents, "Do you think the mayor should resign because of accusations of his criminal activity?" The other asks, "Do you think the mayor should resign?" The first poll concluded that the majority of city residents think the mayor should resign. The second poll drew exactly the opposite conclusion. Explain why their results might be so different.

The wording of the questions is different enough to produce different responses: mentioning bribery may cause a more negative reaction than not mentioning it, or some subjects might not even know about the accusations.

300

A church group interested in promoting volunteerism in a community chooses an SRS of 200 community addresses and sends members to visit these addresses during weekday working hours to inquire about the residents’ attitudes toward volunteer work. Sixty percent of all respondents say that they would be willing to donate at least an hour a week to some volunteer organization. Bias is present in this sample design. Identify the type of bias involved and state whether you think the sample percent obtained is higher or lower than the true population percent.

Sampling only during workday hours meant that only people without regular daytime jobs were available to answer the door, so the poll suffered from undercoverage of people who were employed. Since those who are not employed may be more likely to have time to volunteer, the poll probably overestimated the proportion of potential volunteers. There is also potential response bias: a person is likely to say he or she will volunteer in order to look like a good person.

300

A family restaurant chain wants to test the market for a new menu item: a grilled chicken sandwich with chipotle salsa. They are interested in both how to market the item and the right price to charge for it. They decide to offer the sandwich at 60 different restaurants in the chain, using two different descriptions on the menu. Half the restaurants’ menus will emphasize "healthy eating" and half will emphasize "value." These two groups of restaurants will be further divided into three groups, each charging either a High, Medium, or Low price for the sandwich. After a month, they will measure what proportion of customers order the new sandwich.

Suppose the company plans to conduct a completely randomized design. List the experimental units, factors, and treatments in this experimental design.

Experimental units: the 60 restaurants. Factors: menu description and price. Treatments: Healthy-High price, Healthy-Medium price, Healthy-Low price, Value-High price, Value-Medium price, Value-Low price.
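
The treatments are every combination of one level from each factor. A short Python sketch of that enumeration follows; the variable names are illustrative, and the equal split of restaurants across treatments is an assumption based on the problem's description of halves and thirds.

```python
from itertools import product

descriptions = ["Healthy", "Value"]   # factor 1: menu description
prices = ["High", "Medium", "Low"]    # factor 2: price level

# Each treatment pairs one description with one price
treatments = [f"{d}-{p} price" for d, p in product(descriptions, prices)]
for t in treatments:
    print(t)

# Assuming the 60 restaurants are split equally among the treatments
restaurants_per_treatment = 60 // len(treatments)
print(restaurants_per_treatment)  # 10
```

Two levels of one factor crossed with three levels of the other give 2 x 3 = 6 treatments.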

300

The customer service call center for a major electronics manufacturer is trying to determine how to keep customers who are on hold as happy as possible. They want to examine whether the type of music they play while customers are on hold and whether or not there is a periodically-repeated recorded message ("Thank you for your patience, we’ll be with you as soon as possible.") have an impact on customer satisfaction. They plan to randomly select customers who are on hold and play one of three different types of music ("smooth" jazz, classical, or Broadway show tunes) and either play recorded messages or not. After the entire call is over, they will ask the customers to rate their overall customer service experience.

Suppose the company plans to conduct a completely randomized design. List the experimental units, factors and treatments in this experimental design.

Experimental units: customers who are put on hold. Factors: type of music, presence of recorded message. Treatments: jazz-message, classical-message, show tunes-message, jazz-no message, classical-no message, show tunes-no message.

300

For the study described below, comment on the extent to which inferences can be drawn about a larger population and whether cause and effect can be established.

Does "Cold-Cut," a popular over-the-counter cold remedy that claims to reduce the length and severity of colds, really work? A consumer advocacy group addresses this question by asking a random sample of 400 adults how many colds they’d had in the last six months, how long each cold lasted, and if they took "Cold-Cut" to treat the cold.

No random assignment-->cause and effect cannot be inferred. Random sampling-->Can generalize to population from which the random sample was selected.

300

For the study described below, comment on the extent to which inferences can be drawn about a larger population and whether cause and effect can be established.

Does using a calculator improve understanding of mathematical concepts? All 200 fifth-graders at a school are randomly assigned to one of two groups. One group studies addition of fractions with the aid of a calculator, the other studies the same topic without a calculator. Scores on a fractions test are compared after two weeks.

Random assignment-->cause and effect can be inferred. No random sampling (in fact, it was a census of all fifth-graders at the school)-->Cannot generalize beyond the fifth-graders at this school.

400

Your school will send a delegation of 35 seniors to a student life convention. 200 girls and 150 boys are eligible to be chosen. If a sample of 20 girls and a separate sample of 15 boys are each selected randomly, it gives each senior the same chance to be chosen to attend the convention.

Is it an SRS? Explain.

No. Not every group of 35 seniors is equally likely to be selected. It’s impossible, for example, to have a group that is all girls. This is a stratified random sample.
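
The distinction between "every senior has the same chance" and "every group of 35 has the same chance" can be made concrete with a little arithmetic. The Python sketch below uses only the counts given in the problem; the variable names are illustrative.

```python
from math import comb

girls, boys = 200, 150
n_girls, n_boys = 20, 15

# Chance that any one senior is chosen: the same in both strata
p_girl = n_girls / girls
p_boy = n_boys / boys
print(p_girl, p_boy)

# Number of delegations the stratified design can produce,
# versus the number of groups of 35 an SRS could produce
stratified_groups = comb(girls, n_girls) * comb(boys, n_boys)
srs_groups = comb(girls + boys, n_girls + n_boys)
print(stratified_groups < srs_groups)  # True
```

Individual chances are equal, but the stratified design can only produce groups with exactly 20 girls and 15 boys, so many groups of 35 that an SRS could select have no chance of being chosen.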

400

Each state conducts an annual study of seat belt use by drivers following guidelines set by the federal government. Seat belt use is observed at randomly chosen road locations at random times during daylight hours. The locations are based on counties within each state. In Hawaii, the counties are the islands that make up the state’s territory, and the survey is conducted on the 4 most populated islands: Oahu, Maui, Hawaii (referred to as "The Big Island"), and Kauai. The sample sizes on the islands are proportional to the amount of road traffic, so each location is equally likely to be selected.

Is this an SRS of road locations in the state of Hawaii? Explain.

No. Not every group of locations is equally likely to be selected. It’s impossible, for example, to have a sample of locations that are all on Oahu. This is a stratified random sample.

400

A family restaurant chain wants to test the market for a new menu item: a grilled chicken sandwich with chipotle salsa. They are interested in both how to market the item and the right price to charge for it. They decide to offer the sandwich at 60 different restaurants in the chain, using two different descriptions on the menu. Half the restaurants’ menus will emphasize "healthy eating" and half will emphasize "value." These two groups of restaurants will be further divided into three groups, each charging either a High, Medium, or Low price for the sandwich. After a month, they will measure what proportion of customers order the new sandwich.

Suppose that 30 of the restaurants in the study are free-standing buildings and the other 30 are located inside malls. The company suspects that the different building types may have impact on how people respond to the advertising campaign and the price. How might they alter the design of this experiment to take this into account?

Block for building type: randomly assign the six different treatments to the 30 free-standing restaurants, then do the same for the 30 mall restaurants, so that exactly five restaurants of each building type are assigned to each treatment.
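
A randomized block design like this one can be sketched in Python as follows. The restaurant IDs (`F01`, `M01`, and so on), the function name `block_randomize`, and the seed are hypothetical and used only for illustration.

```python
import random
from itertools import product

def block_randomize(units_by_block, treatments, seed=None):
    """Within each block, shuffle the units and deal them out
    in equal numbers to each treatment."""
    rng = random.Random(seed)
    assignment = {}
    for block, units in units_by_block.items():
        shuffled = list(units)
        rng.shuffle(shuffled)
        per_treatment = len(shuffled) // len(treatments)
        for i, treatment in enumerate(treatments):
            start = i * per_treatment
            for unit in shuffled[start:start + per_treatment]:
                assignment[unit] = (block, treatment)
    return assignment

treatments = [f"{d}-{p}" for d, p in
              product(["Healthy", "Value"], ["High", "Medium", "Low"])]
units_by_block = {
    "free-standing": [f"F{i:02d}" for i in range(1, 31)],  # hypothetical IDs
    "mall": [f"M{i:02d}" for i in range(1, 31)],           # hypothetical IDs
}
assignment = block_randomize(units_by_block, treatments, seed=1)
print(assignment["F01"], assignment["M01"])
```

Because the randomization is carried out separately inside each block, building type cannot be confounded with the treatments.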

400

The customer service call center for a major electronics manufacturer is trying to determine how to keep customers who are on hold as happy as possible. They want to examine whether the type of music they play while customers are on hold and whether or not there is a periodically-repeated recorded message ("Thank you for your patience, we’ll be with you as soon as possible.") have an impact on customer satisfaction. They plan to randomly select customers who are on hold and play one of three different types of music ("smooth" jazz, classical, or Broadway show tunes) and either play recorded messages or not. After the entire call is over, they will ask the customers to rate their overall customer service experience.

Suppose the company is concerned that the time when the call is made (daytime versus evening) will have an impact on which combination of music and messages is most effective. How might they alter the design of this experiment to take this into account?

Block for the time of day when the call is made: randomly assign equal numbers of customers who call during the daytime and customers who call in the evening to each treatment type, so that there are roughly the same number of customers from each time period in each group.

400

For the study described below, comment briefly on the extent to which results can be generalized to some larger population, and the extent to which cause and effect has been established.

A marketing executive who wants to gauge reactions to a new packaging design for a popular brand of cookie places the new packages in 45 randomly-selected grocery stores in a large city and compares sales of the cookies to sales of the same cookie (with the old packaging) in the previous month.

Since packaging type is confounded with time, cause and effect cannot be inferred: we cannot separate the effect of packaging from differences in sales from last month to this month. We can, however, make inferences about the population of all grocery stores in this city, since a random sample of stores was used.

400

For the study described below, comment briefly on the extent to which results can be generalized to some larger population, and the extent to which cause and effect has been established.

A consumer advocacy organization wants to determine if using premium gasoline in the engines of cars improves gas mileage. They randomly select 40 makes and models of new cars and acquire two of each. They run each car on a track for 1000 miles, one with regular gasoline, one with premium. (Which car within each pair gets the premium gas is determined by coin flip). After driving each car, they determine the difference in fuel consumption within each pair of cars.

Random assignment within matched pairs-->cause and effect can be inferred. Random sampling of makes and models-->Can generalize to the population of new car makes and models from which the sample was drawn.

500

Your school will send a delegation of 35 seniors to a student life convention. 200 girls and 150 boys are eligible to be chosen. If a sample of 20 girls and a separate sample of 15 boys are each selected randomly, it gives each senior the same chance to be chosen to attend the convention.

Beginning at line 108 in the random digits table, reproduced below, select the first three senior girls to be in the sample. Explain your procedures clearly.

108 60940 72024 17868 24943 61790 90656 87964 18883

109 36009 19365 15412 39638 85453 46816 83485 41979

110 38448 48789 18338 24697 39364 42006 76688 08708

Make sure the assigned labels are all the same length (001 to 200) and that sampling is done without replacement. Reading successive 3-digit groups starting at line 108 and skipping repeats and numbers outside 001 to 200, the first three labels selected are 179, 090, and 009.
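
With 3-digit labels, the groups do not line up with the 5-digit blocks or with the ends of the table lines. The Python sketch below assumes the digits are read as one continuous stream from line 108 into line 109 and so on, with the line labels ignored. The generator name `digit_groups` is an illustrative choice.

```python
def digit_groups(table_lines, width):
    """Yield successive fixed-width groups from the table, treating the
    digits as one continuous stream across blocks and lines."""
    stream = "".join("".join(line.split()[1:]) for line in table_lines)  # drop line labels
    for i in range(0, len(stream) - width + 1, width):
        yield stream[i:i + width]

table = [
    "108 60940 72024 17868 24943 61790 90656 87964 18883",
    "109 36009 19365 15412 39638 85453 46816 83485 41979",
    "110 38448 48789 18338 24697 39364 42006 76688 08708",
]

selected = []
for group in digit_groups(table, width=3):
    label = int(group)
    # Keep labels 001-200 only, and never the same girl twice
    if 1 <= label <= 200 and label not in selected:
        selected.append(label)
    if len(selected) == 3:
        break
print(selected)  # [179, 90, 9]
```

The third label, 009, comes from the start of line 109, after the leftover final digit of line 108 is combined with the first two digits of line 109 to form the out-of-range group 336.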

500

Each state conducts an annual study of seat belt use by drivers following guidelines set by the federal government. Seat belt use is observed at randomly chosen road locations at random times during daylight hours. The locations are based on counties within each state. In Hawaii, the counties are the islands that make up the state’s territory, and the survey is conducted on the 4 most populated islands: Oahu, Maui, Hawaii (referred to as "The Big Island"), and Kauai. The sample sizes on the islands are proportional to the amount of road traffic, so each location is equally likely to be selected.

Suppose there are 476 possible road locations on Kauai and we need to randomly select 22 of them to be in the sample. Beginning at line 120 in the random digits table below, choose the first 3 road locations for the seat belt survey sample. Explain your method clearly.

120 35476 55972 39421 65850 04266 35435 43742 11937

121 71487 09984 29077 14863 61683 47052 62224 51025

122 13873 81598 95052 90908 73592 75186 87136 95761

Make sure the assigned labels are all the same length (001 to 476) and that sampling is done without replacement. Reading successive 3-digit groups starting at line 120 and skipping repeats and numbers outside 001 to 476, the first three labels selected are 354, 239, and 421.

500

A college fitness center offers an exercise program for staff members who choose to participate. The program assesses each participant’s fitness using a treadmill test, and also administers a personality questionnaire. There is a moderately strong positive correlation between fitness score and score for self-confidence. Explain why it would not be possible to conclude from this study that the exercise program increases one’s self-confidence.

Since this was an observational study, we cannot establish cause and effect. It’s possible, for instance, that people with more self-confidence are more likely to choose to exercise. That is, people’s personality might be a confounding variable.

500

Many utility companies have introduced programs to encourage energy conservation among their customers. An electric company considers placing electronic meters in households to show what the cost would be if the electricity use at that moment continued for a month. It gives these meters to 100 of its customers for a year and then compares the average electricity use in these customers’ homes this year to the previous year. Result: These customers’ average electricity use decreased by 10%. Explain why this is not strong evidence that the use of the electronic meters caused customers to decrease their electricity use.

The effect of the electronic meters is confounded with year, so we can’t be sure that the meters were the cause of the reduced electricity use. Perhaps the second year was not as cold as the first, so less electricity was used for heating. In this case, temperature would be a confounding variable.

500

Preliminary observational studies have linked consumption of caffeine during pregnancy to a higher incidence of miscarriages. It would be unethical to run a controlled experiment to establish cause and effect in this situation. Describe two ways in which researchers can seek to establish cause and effect that do not involve experiments.

(Answers may vary) Good general categories: establish a strong association between caffeine consumption and miscarriages in a wide variety of studies; establish a plausible mechanism for the impact of caffeine on miscarriages; show the association exists in studies that stratify for possible lurking variables, such as other health factors that may be confounded with caffeine consumption.

500

A few studies have suggested that people who live within a few hundred yards of high-voltage power lines are more likely to get certain forms of cancer. It would be both unethical and impractical to conduct a controlled experiment to establish cause and effect in this situation. Describe two ways in which researchers can seek to establish cause and effect that do not involve experiments.

(Answers may vary) Good general categories: establish a strong association between proximity to power lines and cancer in a wide variety of studies; establish a plausible mechanism for the impact of power lines on cancer; show the association exists in studies that stratify for possible lurking variables, such as other environmental factors that may be confounded with proximity to power lines.