Types of Data
Central Tendency
The Normal Distribution
Sampling Methods
Correlation
100

This type of variable describes data that falls into groups or labels, such as "Hair Color" or "Gender," rather than numerical measurements.

What is categorical (or qualitative) data?

100

This measure of center is calculated by adding all the values in a data set and dividing by the number of observations.

What is the mean?

100

In a Normal distribution, approximately this percentage of data falls within one standard deviation of the mean.

What is 68%?
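
As a quick check of that figure, the sketch below uses Python's standard-library `statistics.NormalDist` to compute the area within one standard deviation of the mean on a standard Normal curve. The variable names are illustrative only.

```python
from statistics import NormalDist

# Standard Normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Area under the curve between z = -1 and z = 1.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
print(round(within_one_sd, 4))  # roughly 0.68
```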

100

This "gold standard" sampling method gives every possible group of $n$ individuals in the population an equal chance of being chosen.

What is a Simple Random Sample (SRS)?

100

This value, symbolized by the letter $r$, measures the strength and direction of a linear relationship and always falls between $-1$ and $1$.

What is the correlation coefficient?

200

This type of graph displays a five-number summary, consisting of the minimum, first quartile, median, third quartile, and maximum.

What is a boxplot (or box-and-whisker plot)?

200

This term describes two events that cannot occur at the same time, meaning the probability of their intersection is zero.

What is mutually exclusive (or disjoint)?

200

This is a fake treatment used in experiments, such as a sugar pill, given to the control group to account for the psychological effect of receiving a treatment.

What is a placebo?

200

This specific probability distribution is used when there are a fixed number of independent trials, each with only two possible outcomes (success or failure) and the same probability of success.

What is a Binomial distribution?

200

If a distribution has a long tail stretching out toward the higher values on the right, it is described by this two-word term.

What is skewed right?

300

This value represents the number of standard deviations an observation falls above or below the mean, calculated as $(x - \mu) / \sigma$.

What is a z-score?
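
A minimal sketch of the formula $(x - \mu) / \sigma$ in Python. The function name `z_score` and the numbers in the example are made up for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Made-up example: a score of 85 when the mean is 70 and the SD is 10.
print(z_score(85, 70, 10))  # 1.5
```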

300

This rule states that for any two events A and B, the probability of A or B occurring is P(A) + P(B) - P(A and B).

What is the General Addition Rule?
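
The rule can be checked on a small example. The sketch below assumes a fair six-sided die and two illustrative events (A: the roll is even, B: the roll is greater than 3), and compares the rule against a direct count of "A or B".

```python
from fractions import Fraction

outcomes = set(range(1, 7))  # fair six-sided die (assumed)
A = {2, 4, 6}                # event A: roll is even
B = {4, 5, 6}                # event B: roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))

by_rule = prob(A) + prob(B) - prob(A & B)  # P(A) + P(B) - P(A and B)
direct = prob(A | B)                       # P(A or B) counted directly
print(by_rule, direct)  # 2/3 2/3
```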

300

In a least-squares regression line, this value is the difference between an observed y-value and the y-value predicted by the model.

What is a residual?
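
A short sketch of the calculation, assuming a hypothetical fitted line $\hat{y} = 2 + 3x$ and one made-up observation. The intercept, slope, and data point are not from any real data set.

```python
def predict(x, intercept, slope):
    """Predicted y-value from a fitted line y-hat = intercept + slope * x."""
    return intercept + slope * x

def residual(observed_y, predicted_y):
    """Residual = observed y minus predicted y."""
    return observed_y - predicted_y

# Hypothetical line y-hat = 2 + 3x and a made-up observation (x = 4, y = 15).
y_hat = predict(4, intercept=2, slope=3)
print(residual(15, y_hat))  # 1, so the point sits 1 unit above the line
```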

300

This occurs when the effects of two variables on a response variable cannot be distinguished from each other, often involving an outside "lurking" variable.

What is confounding?

300

According to this fundamental theorem, as the sample size n increases, the sampling distribution of the sample mean becomes approximately Normal, regardless of the population's shape.

What is the Central Limit Theorem (CLT)?

400

This is the probability that event A occurs, given that event B has already occurred; it is calculated by dividing P(A and B) by P(B).

What is conditional probability?
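
A minimal sketch of conditional probability computed as P(A and B) divided by P(B), assuming a fair six-sided die and two illustrative events (A: the roll is even, B: the roll is greater than 3).

```python
from fractions import Fraction

outcomes = set(range(1, 7))  # fair six-sided die (assumed)
A = {2, 4, 6}                # event A: roll is even
B = {4, 5, 6}                # event B: roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))

# P(A | B) = P(A and B) / P(B)
p_a_given_b = prob(A & B) / prob(B)
print(p_a_given_b)  # 2/3
```

Knowing that the roll is greater than 3 leaves three equally likely outcomes (4, 5, 6), two of which are even, which matches the result.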

400

To use a Normal distribution for a sampling distribution of a proportion, this condition requires that both $np$ and $n(1-p)$ are at least 10.

What is the Large Counts condition (or Success/Failure condition)?
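
The check itself is a pair of comparisons. The sketch below uses a hypothetical helper, `large_counts_ok`, with made-up sample sizes and proportions.

```python
def large_counts_ok(n, p, threshold=10):
    """True when both n*p and n*(1-p) are at least the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

# Made-up examples with p = 0.1.
print(large_counts_ok(50, 0.1))   # False: expected successes = 5
print(large_counts_ok(200, 0.1))  # True: expected counts are 20 and 180
```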

400

In a contingency table, this type of distribution describes the values of a single variable among all individuals, often found in the "total" row or column.

What is a marginal distribution?

400

To find the standard deviation of the sum of two independent random variables, you must first add these measures of spread together and then take the square root.

What are variances?
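
As an illustration, the sketch below combines two made-up standard deviations for independent random variables by squaring them, adding the variances, and taking the square root. The function name `sd_of_sum` is hypothetical.

```python
import math

def sd_of_sum(sd_x, sd_y):
    """SD of X + Y for independent X and Y: add variances, then take the root."""
    variance_of_sum = sd_x ** 2 + sd_y ** 2
    return math.sqrt(variance_of_sum)

# Made-up example: SDs of 3 and 4 combine to 5, not 7.
print(sd_of_sum(3, 4))  # 5.0
```

Adding the standard deviations directly (3 + 4 = 7) would overstate the spread of the sum.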

400

This type of error occurs when a researcher fails to reject a null hypothesis that is actually false (a "false negative").

What is a Type II error?

500

This value is the probability of observing a test statistic as extreme as, or more extreme than, the one calculated, assuming the null hypothesis is true.

What is the p-value?

500

Defined as $1 - \beta$, this term represents the probability that a test will correctly reject a false null hypothesis.

What is power?

500

When the population standard deviation is unknown and we use the sample standard deviation instead, the resulting measure of variability for the sampling distribution is called this.

What is the standard error?

500

This technique is used to ensure that the groups in an experiment are as similar as possible by grouping subjects with shared characteristics before randomly assigning treatments.

What is blocking (or a randomized block design)?

500

In a confidence interval for a mean, this specific component—represented by the product of the critical value and the standard error—determines the width of the interval on either side of the point estimate.

What is the margin of error?
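
A minimal sketch of that product for a one-sample interval for a mean. The sample values are made up, and the critical value $t^* = 2.262$ (95% confidence, 9 degrees of freedom) is supplied by hand rather than computed, since it would normally come from a table or statistical software.

```python
import math

def margin_of_error(critical_value, sample_sd, n):
    """Critical value times the standard error of the sample mean."""
    standard_error = sample_sd / math.sqrt(n)
    return critical_value * standard_error

# Hypothetical sample: n = 10, sample mean 50.0, sample SD 4.0.
# t* = 2.262 is the 95% critical value for 9 degrees of freedom.
moe = margin_of_error(2.262, 4.0, 10)
sample_mean = 50.0
print(round(moe, 2))                           # about 2.86
print(sample_mean - moe, sample_mean + moe)    # interval endpoints
```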
