Data Displays
SOCS stuff
Correlation
Studies
Experimental Design
100

Draw a boxplot for the data:  1, 3, 5, 6, 7, 10, 13, 17, 20
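A boxplot is drawn from the five-number summary. The sketch below computes it in Python, assuming quartiles are the medians of the lower and upper halves (leaving out the overall median when n is odd). Other software may use a different quartile convention, and the function name is hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using median-of-halves quartiles."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

summary = five_number_summary([1, 3, 5, 6, 7, 10, 13, 17, 20])
print(summary)  # (1, 4.0, 7, 15.0, 20)
```

The box runs from Q1 = 4 to Q3 = 15 with a line at the median of 7, and the whiskers extend to 1 and 20.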

100

A graph shows two peaks. What is this shape called, and what might it indicate?

Bimodal. It may indicate that the data come from two distinct groups.

100

Correlation does not equal ____________ except when you do an _____________.

causation; experiment with randomly assigned treatments

100

What is the difference between an observational study and an experiment?

Experiments assign treatments, observational studies don't.

100

Describe a completely randomized design.

All subjects are randomly assigned to the treatment groups (for example, by drawing names from a hat), so every subject has the same chance of ending up in any group.

200

Make a stem-and-leaf plot for the data: 9, 14, 15, 5, 18, 19, 19, 20, 21, 23, 25, 33
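One way to build the plot is to sort the data, use the tens digit as the stem and the ones digit as the leaf. This is a rough Python sketch that assumes non-negative whole numbers; the function name is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the rows of a stem-and-leaf plot (tens = stem, ones = leaf)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    rows = []
    # Include every stem between the smallest and largest, even if empty.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(f"{stem} | {row}".rstrip())
    return rows

plot = stem_and_leaf([9, 14, 15, 5, 18, 19, 19, 20, 21, 23, 25, 33])
print("\n".join(plot))
print("Key: 1 | 4 = 14")
```

For this data the rows are `0 | 5 9`, `1 | 4 5 8 9 9`, `2 | 0 1 3 5`, and `3 | 3`.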

200

What is standard deviation?

The typical (roughly average) distance of the data values from the mean

200

What is the symbol for correlation? What is its range?

r; -1 to 1

200

What are experimental units?

The individuals or objects to which treatments are applied (called subjects when they are people)

200

Describe a randomized block design.

Subjects are first grouped (i.e. blocked) and then treatments are randomly assigned within each block
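The two steps (block first, then randomize within each block) might look like this in Python. The subject names, the blocking variable, and the function name are all hypothetical, and the sketch deals treatments out in rotation after shuffling so each block is split as evenly as possible.

```python
import random

def randomized_block_assignment(blocks, treatments, seed=None):
    """Shuffle the subjects in each block, then deal out treatments in rotation."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, subjects in blocks.items():
        shuffled = list(subjects)
        rng.shuffle(shuffled)  # randomization happens separately inside each block
        for i, subject in enumerate(shuffled):
            assignment[subject] = (block_name, treatments[i % len(treatments)])
    return assignment

# Hypothetical subjects, blocked by grade level.
blocks = {
    "freshmen": ["Ana", "Ben", "Cy", "Dee"],
    "seniors": ["Eli", "Fay", "Gus", "Hal"],
}
assignment = randomized_block_assignment(blocks, ["treatment", "control"], seed=1)
for subject, (block, treatment) in assignment.items():
    print(subject, block, treatment)
```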

300

Which measures of center and spread are resistant to outliers?

Median, IQR

300

What are three ways you might identify outliers?

Gaps, 1.5 IQR rule, 2 SD rule

300

Does an outlier make r increase or decrease? Explain.

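It depends. An outlier that falls outside the overall pattern typically weakens the correlation, pulling r toward 0: a negative r increases and a positive r decreases. An outlier that follows the linear pattern can strengthen the correlation instead.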

300

What is blinding? Double-blinding? Placebo?

Blinding: subjects don't know which treatment they are receiving. Double-blinding: neither the subjects nor the researchers who interact with them know. Placebo: a fake treatment.

300

Describe a matched pairs design.

Each subject is paired with the most similar subject, and the two treatments are randomly assigned within each pair, one to each subject.

400

When is it better to use median and IQR instead of mean and standard deviation?

When data is skewed or has outliers

400

A data set has a mean of 60 and standard deviation of 10. Would 85 be considered an outlier? Why?

Yes. It is more than 2 SDs above the mean.
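The check amounts to counting how many standard deviations the value sits from the mean. A minimal Python sketch (the function name is hypothetical):

```python
def sds_from_mean(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

z = sds_from_mean(85, 60, 10)
is_outlier = abs(z) > 2  # 2 SD rule
print(z, is_outlier)  # 2.5 True
```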

400

What is the difference between an influential point and an outlier?

Outliers are far from the LSRL vertically, but influential points might be far from the rest of the data horizontally.

400

What is nonresponse bias? Give an example.

When a certain segment of the people selected for the sample chooses NOT to respond. For example, very busy people may be less likely to return a mailed survey.

400

What is a cluster sample?

Population is naturally divided into clusters. Clusters are randomly chosen and everyone in the cluster is sampled.
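A rough Python sketch of the idea: pick whole clusters at random, then keep everyone in them. The homeroom names, student labels, and function name are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose n_clusters clusters and return every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # randomness is at the cluster level
    return chosen, [member for name in chosen for member in clusters[name]]

# Hypothetical homerooms acting as clusters.
homerooms = {
    "Room 101": ["s1", "s2", "s3"],
    "Room 102": ["s4", "s5", "s6"],
    "Room 103": ["s7", "s8", "s9"],
    "Room 104": ["s10", "s11", "s12"],
}
chosen, sample = cluster_sample(homerooms, n_clusters=2, seed=3)
print(chosen, sample)
```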

500

An LSRL is used to predict a person's weight given their height. Interpret a residual of -10.2

This person weighs 10.2 pounds less than what was predicted for their height.
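As a quick numeric check in Python, with hypothetical values chosen only so that the residual comes out to -10.2:

```python
def residual(actual, predicted):
    """Residual = actual y-value minus the y-value predicted by the LSRL."""
    return actual - predicted

# Hypothetical: the line predicts 160.2 lb for this height; the person weighs 150 lb.
res = round(residual(150.0, 160.2), 1)
print(res)  # -10.2
```

A negative residual means the actual value falls below the line, so the model overpredicted.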

500

Are there any outliers by the 1.5IQR rule? 1, 3, 5, 6, 7, 10, 13, 17, 32

Yes. 32 is above upper fence of 31.5
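The fence calculation can be checked with a short Python sketch, assuming quartiles are the medians of the lower and upper halves (other quartile conventions can give slightly different fences). The function name is hypothetical.

```python
from statistics import median

def iqr_fences(values):
    """Return (lower fence, upper fence) for the 1.5 x IQR rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])            # median of the lower half
    q3 = median(data[half + n % 2:])    # median of the upper half
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [1, 3, 5, 6, 7, 10, 13, 17, 32]
low, high = iqr_fences(data)
outliers = [x for x in data if x < low or x > high]
print(low, high, outliers)  # -12.5 31.5 [32]
```

Here Q1 = 4 and Q3 = 15, so IQR = 11 and the upper fence is 15 + 1.5(11) = 31.5.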

500

Define r-squared.

The percent of variation in the y-variable that can be attributed to the linear model that uses the x-variable.
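To make the definition concrete, this Python sketch fits a least-squares line and computes r-squared as 1 - SSE/SST. The data and the function name are made up for illustration.

```python
from statistics import mean

def r_squared(xs, ys):
    """Fraction of the variation in y accounted for by the least-squares line on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # SSE: variation left over after using the line; SST: total variation in y.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(r_squared(xs, ys), 2))  # 0.6
```

For this data, about 60% of the variation in y is accounted for by the linear model on x.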

500

What is a confounding variable?

A variable that is not being assigned or recorded yet is having an effect on the response variable.

500

What is a systematic random sample?

Choose a random starting point, then sample every nth person in the population.
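A minimal Python sketch, assuming the population is an ordered list and the starting position is picked at random from the first n entries. The roster and function name are hypothetical.

```python
import random

def systematic_sample(population, step, seed=None):
    """Pick a random start among the first `step` positions, then take every step-th item."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return population[start::step]

roster = list(range(1, 101))  # hypothetical population numbered 1 to 100
sample = systematic_sample(roster, step=10, seed=0)
print(sample)
```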

600

How do we interpret the slope of an LSRL?

For each additional unit of x, the predicted value of y increases/decreases by approximately the slope.

600

What is a residual? What's the formula?

The difference between the actual and predicted y-values: residual = actual - predicted

600

What type of hypothesis test is used for linear regression? How many degrees of freedom does it have?

t-test for a slope; df = n - 2

600

What is a convenience sample?

A sample that was easy to get and was NOT random.

600

What is a stratified sample?

The population is grouped by natural strata. An SRS is taken within each stratum.
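In contrast with a cluster sample, every group is represented and the randomness happens inside each group. A rough Python sketch, with hypothetical strata, member labels, and function name:

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Take a simple random sample of size per_stratum from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}

# Hypothetical strata: students grouped by grade level.
strata = {
    "juniors": ["j1", "j2", "j3", "j4", "j5"],
    "seniors": ["s1", "s2", "s3", "s4", "s5"],
}
sample = stratified_sample(strata, per_stratum=2, seed=7)
print(sample)
```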
