When Means Mislead
Reliability of Measures
Time Series Insight
Correlation Pitfalls
Beyond the Numbers
100

Two samples have the same mean but very different boxplots. What does this show?

Mean alone may be misleading; variability differs

100

Which is more sensitive to extreme values: variance or IQR?

Variance.
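
A minimal sketch of why, using made-up readings (the data and the helper name `iqr` are assumptions for illustration, not part of the quiz). One extreme value inflates the variance but leaves the IQR where it was:

```python
import statistics

def iqr(values):
    # Interquartile range Q3 - Q1, using the statistics module's
    # default ("exclusive") quartile method.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Hypothetical readings; the second list swaps the largest value for an outlier.
clean = [10, 11, 12, 13, 14, 15, 16, 17, 18]
with_outlier = clean[:-1] + [180]

var_clean = statistics.variance(clean)
var_outlier = statistics.variance(with_outlier)
iqr_clean = iqr(clean)
iqr_outlier = iqr(with_outlier)

print("variance:", var_clean, "->", var_outlier)
print("IQR:     ", iqr_clean, "->", iqr_outlier)
```

The squared deviations in the variance give the single outlier enormous weight, while the quartiles depend only on the middle of the sorted data.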

100

What does a steady upward trend in a time plot suggest?

A systematic increase in the measured quantity over time (e.g., growth or improving performance)

100

If r = 0, what does that mean?

No linear relationship (but nonlinear may exist).
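
As an illustration with invented data, y = x² over a symmetric range of x is perfectly determined by x, yet its Pearson correlation is zero. The helper `pearson_r` below is a hand-written sketch of the standard formula, not a library call:

```python
import math

def pearson_r(xs, ys):
    # Pearson correlation: covariance scaled by both standard deviations.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect, but purely nonlinear, relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print("r =", r)
```

The positive and negative halves of the parabola cancel in the covariance, which is why a scatterplot is needed alongside r.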

100

Why are plots essential before statistical modeling?

They reveal patterns and anomalies quickly

200

A dataset has extreme outliers. Which measure of center is more robust?

Median
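
A small sketch with hypothetical readings shows the difference. Adding one extreme value moves the mean a long way and the median hardly at all:

```python
import statistics

# Hypothetical measurements, then the same data with one extreme outlier added.
readings = [48, 50, 51, 52, 49]
with_outlier = readings + [500]

mean_shift = statistics.mean(with_outlier) - statistics.mean(readings)
median_shift = statistics.median(with_outlier) - statistics.median(readings)

print("mean moved by:  ", mean_shift)
print("median moved by:", median_shift)
```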

200

Why might range be unreliable?

It only considers two values: min and max.

200

Why are seasonal cycles important in time data?

They reveal periodic patterns affecting interpretation.

200

Two variables have r = 0.9 but the scatterplot shows a curve. What’s wrong?

Correlation measures only linear association, so a high r does not confirm that the relationship is a straight line.

200

Why does variability matter in engineering?

High variability means less predictable, less reliable systems

300

If two datasets share the same mean but one is skewed, what does this imply?

Distribution shape matters; mean may not represent the data well

300

Which is more stable across samples: standard deviation or range?

Standard deviation.

300

A sudden level shift in a time series suggests …

A structural change in the process.

300

Why is ‘correlation ≠ causation’ important?

High correlation may be spurious; not evidence of cause-effect.

300

Why compare multiple displays (histogram, boxplot, etc.)?

Each emphasizes different features of the data.

400

Why might the median be better than the mean for income data?

Income data are often skewed; the median resists outliers

400

What makes IQR useful in skewed data?

It captures central spread without being affected by extremes

400

Combining a stem-and-leaf display with a time sequence plot gives what display?

A digidot plot.

400

What does a nearly perfect correlation (r ≈ 1) between color and density in wine data imply?

The two variables are almost redundant.

400

How can descriptive statistics guide material selection?

They summarize strength, variability, and reliability

500

What is the danger of summarizing data with only the mean?

It hides spread, skewness, and outliers.

500

If dividing the sum of squared deviations by n consistently underestimates variability, what adjustment is made to get s²?

Divide by n-1 (degrees of freedom).
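
One way to see the effect is to enumerate every sample from a tiny population, as in the sketch below. The three-value population and the sample size of 2 are assumptions chosen for illustration. Averaged over all samples, the divide-by-n version falls short of the population variance, while the divide-by-(n-1) version matches it:

```python
import itertools
import math
import statistics

# Hypothetical population, small enough to enumerate every sample.
population = [1, 2, 3]
true_var = statistics.pvariance(population)

# All ordered samples of size 2, drawn with replacement.
samples = list(itertools.product(population, repeat=2))

# pvariance divides by n; variance divides by n - 1.
avg_div_n = statistics.mean(statistics.pvariance(s) for s in samples)
avg_div_n_minus_1 = statistics.mean(statistics.variance(s) for s in samples)

print("population variance:     ", true_var)
print("average, divide by n:    ", avg_div_n)
print("average, divide by n - 1:", avg_div_n_minus_1)
```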

500

Why plot data in time order before summary stats?

To detect shifts, cycles, or trends missed by summaries
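
A hedged sketch with invented data: the two series below contain the same values, so the mean and standard deviation are identical, but only the time-ordered one shows the level shift. The helper `half_gap` is a hypothetical stand-in for looking at a time plot:

```python
import statistics

# Hypothetical process readings with a level shift halfway through.
in_time_order = [10, 11, 9, 10, 10, 20, 21, 19, 20, 20]
# Same values with the time order destroyed by interleaving.
reordered = in_time_order[::2] + in_time_order[1::2]

def half_gap(series):
    # Mean of the second half minus mean of the first half.
    mid = len(series) // 2
    return statistics.mean(series[mid:]) - statistics.mean(series[:mid])

same_mean = statistics.mean(in_time_order) == statistics.mean(reordered)
same_stdev = statistics.stdev(in_time_order) == statistics.stdev(reordered)

print("summaries identical:", same_mean and same_stdev)
print("shift seen in time order:", half_gap(in_time_order))
print("shift seen after reordering:", half_gap(reordered))
```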

500

Correlation of –0.7 suggests what?

Strong negative linear relationship

500

Why is descriptive analysis always the first step in data science?

It helps understand data structure before advanced modeling.