Descriptive Statistics

Numerical Summaries

Graphical Displays

Shape & Distribution

Percentiles & Quartiles

Relationships & Correlation

100

This value is the balance point of the data, often called the arithmetic average.

What is the sample mean

100

Graphical displays are critical for revealing these trends, which summary statistics may miss.

What are nonlinear relationships

100

This measure of center is resistant to skewness and outliers, so it is the best choice for representing household income in Michigan.

What is the median

100

This value divides data so that 50% of observations are below it and 50% are above.

What is the median (2nd quartile, Q2)

100

A scatterplot shows points tightly grouped along an upward-sloping line. What does this suggest?

What is a strong positive relationship between the two variables

200

This measure shows how much data varies around the mean, in squared units.

What is the sample variance
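
Worked example for the clue above: a minimal Python sketch of the sample variance, assuming the usual n - 1 divisor. The function name `sample_variance` and the data values are made up for illustration.

```python
import statistics

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    # Squared deviations give the result in squared units of the data.
    return sum((x - mean) ** 2 for x in values) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data, mean = 5
var = sample_variance(data)        # 32 / 7
std = var ** 0.5                   # sample standard deviation

# The standard library uses the same n - 1 convention.
assert abs(var - statistics.variance(data)) < 1e-12
```

Dividing by n - 1 rather than n reflects the degrees of freedom left after the mean has been estimated from the same data.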

200

This plot uses bins of equal width to show frequency.

What is a histogram

200

This type of distribution shows scores clustered around two peaks, say one near 40 and another near 85.

What is a bimodal distribution

200

This value has 25% of the data below it and 75% above it.

What is the first quartile (Q1)

200

Two variables show r ≈ 0, yet the scatterplot reveals a clear curved pattern. What does this mean?

What is the fact that correlation measures only linear relationships and may miss nonlinear ones
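
Worked example for the clue above: a small Python sketch showing r = 0 for a perfectly curved pattern. The helper `pearson_r` and the data (y = x² on x values symmetric about zero) are illustrative assumptions, not part of the original board.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-3, -2, -1, 0, 1, 2, 3]
curved = [x ** 2 for x in xs]        # clear U-shaped pattern
linear = [2 * x + 1 for x in xs]     # exact upward-sloping line

r_curved = pearson_r(xs, curved)     # essentially 0 despite a perfect pattern
r_linear = pearson_r(xs, linear)     # essentially 1
```

The curved data are completely determined by x, yet r is zero because the upward and downward halves cancel; a scatterplot is what exposes the relationship.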

300

This is the positive square root of the variance.

What is the sample standard deviation

300

In the boxplot, this will show as a separate point beyond the whiskers, drawing immediate attention.

What is an outlier

300

A distribution is described as this when its left tail is long and, typically, the mean is less than the median and the median is less than the mode.

What is left-skewed

300

The difference between Q3 and Q1 is called this measure of variability.

What is the interquartile range (IQR = Q3 – Q1)

300

An analyst reports that “high ice cream sales cause more drowning accidents” because both increase together. What’s the flaw?

What is confusing correlation with causation; a lurking variable (temperature) explains both

400

This value is the difference between the largest and smallest data points.

What is the sample range

400

These charts are suitable for continuous data

What are histograms and boxplots

400

Even in a symmetric distribution, the presence of these is visually highlighted in boxplots.

What are outliers

400

A boxplot shows Q1 = 20, Q2 = 25, Q3 = 35. What is the IQR, and what does it represent?

15; it represents the spread of the middle 50% of the data.

400

Two engineers compare correlations: Dataset A has r = 0.85, Dataset B has r = 0.45. Without computations, what can we infer?

Dataset A shows a stronger linear association, while Dataset B has a weaker, less consistent relationship.

500

This concept refers to the number of independent pieces of information in data.

What are degrees of freedom

500

A school official wants to use this chart to display the percentage of students in each major.

What is a pie chart

500

These shapes match the following data, in order: lifespans of lightbulbs, exam scores on a very easy test, heights of adult humans, and rolls of a fair die.

What are right-skewed, left-skewed, normal, and uniform

500

A dataset has 80 values. To find Q1, you compute (n+1)/4 = 20.25. Explain how to obtain Q1.

Q1 is between the 20th and 21st ordered values; interpolate: Q1 = value20 + 0.25 × (value21 – value20).
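
Worked example for the clue above: a minimal Python sketch of the (n+1)/4 position rule with linear interpolation. The function name `first_quartile` and the data set (the integers 1 through 80) are hypothetical; other quartile conventions exist and can give slightly different values.

```python
import math

def first_quartile(values):
    """Q1 via the (n + 1) / 4 position rule with linear interpolation."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 4            # 1-based position in the ordered data
    lower = math.floor(position)      # rank of the value just below Q1
    fraction = position - lower       # how far to move toward the next value
    if lower < 1:                     # too few values to interpolate
        return ordered[0]
    below = ordered[lower - 1]        # convert 1-based rank to 0-based index
    above = ordered[lower]
    return below + fraction * (above - below)

data = list(range(1, 81))             # hypothetical data: 1, 2, ..., 80
q1 = first_quartile(data)             # position 20.25 -> 20 + 0.25 * (21 - 20)
```

With these 80 values the position is 20.25, so Q1 lies a quarter of the way from the 20th ordered value to the 21st.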

500

A dataset shows r ≈ –0.9 between material flexibility and strength. How should this guide engineering design decisions?

It suggests a strong inverse relationship — as flexibility increases, strength decreases — requiring a trade-off decision in material selection.