Analyzing Data
Describing Data
Modeling Data
The Normal Distribution
2 Variable Data & Regression
100

What type of variable represents numerical data where arithmetic operations make sense?

Quantitative variable.

100

What does SOCS stand for when describing a distribution?

Shape, Outliers, Center, Spread (plus context).

100

What does a percentile represent?

The percentage of observations below a given value.

100

What is the area under the entire normal curve equal to?

1 (or 100%).

100

What type of graph is used to display two quantitative variables?

A scatterplot.

200

What is the difference between a response variable and an explanatory variable?

The explanatory variable helps explain or predict changes in the response variable; the response variable measures the outcome of interest.

200

Which measure of center is resistant to outliers: mean or median?

Median.

200

What is a z-score, and what does it tell you?

z = (x − mean)/SD; tells how many SDs an observation is from the mean.

200

A normal distribution has mean = 50 and SD = 10. What z-score corresponds to a value of 70?

z = (70 − 50)/10 = 2.
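
As a quick check, the z-score formula can be sketched in Python. The function name `z_score` is illustrative only and not part of the review set.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies from the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (x - mean) / sd

# Value from the card above: mean = 50, SD = 10, observation = 70.
z = z_score(70, 50, 10)
print(z)
```

A negative result means the observation is below the mean.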

200

In a regression equation y-hat = a·x + b, what does the slope represent?

The predicted change in y for each unit increase in x.

300

Which graph would you use to display the relationship between two categorical variables?

Mosaic plot or segmented bar graph.

300

How do you find outliers using the 1.5×IQR rule?

Values below Q1 − 1.5×IQR or above Q3 + 1.5×IQR.
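
A minimal sketch of the rule in Python. The data set is made up, and the quartiles are passed in by hand (here as medians of the lower and upper halves) because software packages use different quartile conventions. The names `iqr_fences` and `find_outliers` are hypothetical helpers.

```python
def iqr_fences(q1, q3):
    """Return the (lower, upper) cutoffs for the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data, q1, q3):
    """Return the values that fall outside the 1.5 x IQR fences."""
    low, high = iqr_fences(q1, q3)
    return [x for x in data if x < low or x > high]

# Hypothetical data: lower half [2, 10, 11, 12], upper half [13, 14, 15, 30].
data = [2, 10, 11, 12, 13, 14, 15, 30]
outliers = find_outliers(data, q1=10.5, q3=14.5)
print(iqr_fences(10.5, 14.5), outliers)
```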

300

How does adding a constant affect the mean and standard deviation?

Adding a constant shifts the mean by that constant but leaves the standard deviation unchanged.

300

If a student’s z-score on a test is −1.5, what does that tell you about their performance?

The student scored 1.5 SDs below the mean.

300

If correlation (r) = 0, what does that indicate about the relationship between x and y?

There is no linear relationship between them.

400

What is the main advantage of a dot plot or stem-and-leaf plot over a histogram?

They display actual data values, not just frequencies.

400

If the data are skewed right, how do the mean and median compare?

Mean > Median.
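
A small illustration in Python, using a made-up right-skewed data set: the one large value pulls the mean toward the tail, while the median stays put.

```python
import statistics

# Hypothetical right-skewed data: most values are small, one is far to the right.
data = [1, 2, 2, 3, 3, 4, 20]

mean = statistics.mean(data)      # pulled toward the long right tail
median = statistics.median(data)  # resistant to the extreme value

print(mean, median)
```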

400

How does multiplying data by a constant affect the mean and standard deviation?

Multiplies the mean by that constant and the SD by its absolute value.
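
Both transformation rules (adding a constant and multiplying by a constant) can be checked with a short Python sketch. The data values and the constants 10 and 3 are arbitrary choices for illustration.

```python
import statistics

data = [2, 4, 6, 8]                 # hypothetical data
shifted = [x + 10 for x in data]    # add a constant
scaled = [x * 3 for x in data]      # multiply by a constant

base_mean = statistics.mean(data)
base_sd = statistics.stdev(data)    # sample standard deviation

# Adding 10 moves the mean by 10 and leaves the SD alone.
shift_mean = statistics.mean(shifted)
shift_sd = statistics.stdev(shifted)

# Multiplying by 3 multiplies both the mean and the SD by 3.
scale_mean = statistics.mean(scaled)
scale_sd = statistics.stdev(scaled)

print(base_mean, base_sd)
print(shift_mean, shift_sd)
print(scale_mean, scale_sd)
```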

400

Approximately what percent of observations fall between z = −2 and z = 1?

About 81.5% (from 2 SDs below to 1 SD above the mean).
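
The 81.5% figure comes from the 68-95-99.7 rule: half of 95% plus half of 68%. A short Python sketch compares it with the area computed from the standard normal curve using the standard library's `statistics.NormalDist`; the two differ slightly because the rule uses rounded percentages.

```python
from statistics import NormalDist

# Empirical-rule estimate: half of 95% (z = -2 to 0) plus half of 68% (z = 0 to 1).
empirical = 0.95 / 2 + 0.68 / 2

# Area under the standard normal curve between z = -2 and z = 1.
std_normal = NormalDist(mu=0, sigma=1)
exact = std_normal.cdf(1) - std_normal.cdf(-2)

print(round(empirical, 3), round(exact, 4))
```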

400

How does an influential outlier affect the regression line?

It can significantly change the slope, y-intercept, and correlation coefficient (r) of the line.
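
To see the effect, here is a rough Python sketch that fits a least-squares line before and after adding one extreme point. The data and the added point (20, 0) are invented for illustration, and `least_squares` is a hypothetical helper.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data lying exactly on y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
slope_before, intercept_before = least_squares(xs, ys)

# Add one point far to the right in x and far below the pattern.
slope_after, intercept_after = least_squares(xs + [20], ys + [0])

print(slope_before, intercept_before)
print(slope_after, intercept_after)
```

In this made-up example, a single point is enough to turn a positive slope negative.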

500

How might a bar graph be misleading?

If the y-axis doesn’t start at zero, differences between bars look larger than they are.

500

Two data sets have the same mean but different standard deviations. What does this tell you about how the data are distributed?

Both sets are centered at the same value, but the one with the larger standard deviation has more variability: its data are more spread out around the mean.

500

What is a relative frequency graph used for?

Showing proportions or percentages instead of counts.

500

The distribution of IQ scores is normal with mean 100 and SD 15. What IQ corresponds to the 90th percentile?

z ≈ 1.28 → IQ = 100 + (1.28)(15) ≈ 119.2.
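
The same percentile lookup can be sketched with Python's standard library, assuming `statistics.NormalDist` stands in for a z-table or calculator. The result differs slightly from 119.2 only because the table value 1.28 is rounded.

```python
from statistics import NormalDist

# IQ model from the card above: normal with mean 100 and SD 15.
iq_dist = NormalDist(mu=100, sigma=15)

# z-score that has 90% of the standard normal curve below it.
z_90 = NormalDist().inv_cdf(0.90)

# IQ at the 90th percentile, found two equivalent ways.
iq_from_z = 100 + z_90 * 15
iq_direct = iq_dist.inv_cdf(0.90)

print(round(z_90, 2), round(iq_from_z, 1), round(iq_direct, 1))
```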

500

The correlation between study hours (x) and exam score (y) is 0.8. If SDₓ = 2 and SDᵧ = 10, what is the slope of the regression line?

a = r × (sᵧ/sₓ) = 0.8 × (10/2) = 4.0.
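
A minimal Python sketch of the slope formula, using the values from this card. The function name `regression_slope` is illustrative only.

```python
def regression_slope(r, sd_x, sd_y):
    """Return the least-squares slope from the correlation and the two SDs."""
    if sd_x <= 0:
        raise ValueError("sd_x must be positive")
    return r * sd_y / sd_x

# r = 0.8, SD of study hours = 2, SD of exam scores = 10.
slope = regression_slope(0.8, 2, 10)
print(slope)
```

Read in context, a slope of 4 means the predicted exam score rises by about 4 points for each additional hour of study.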