Sampling/Experiments
Exploring Data
Probability
Inference
Wild Card
100

This is when we systematically tend to overestimate or underestimate the true population parameter

Bias

100

This is a number between -1 and 1 that gives the strength and direction of the linear relationship between two quantitative variables

Correlation (r)
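A minimal sketch of computing r by hand, assuming two paired lists of equal length. The function name `correlation` and the sample data are illustrative, not part of the original card. The sign of the result gives the direction of the relationship, and its magnitude gives the strength.

```python
from math import sqrt

def correlation(xs, ys):
    """Return the correlation r for paired quantitative data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-deviations and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: a perfectly increasing and a perfectly decreasing line.
print(correlation([1, 2, 3], [2, 4, 6]))
print(correlation([1, 2, 3], [6, 4, 2]))
```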

100

This law states that simulated (empirical) probabilities tend to get closer to the true probability of an event as the number of trials increases.

The Law of Large Numbers
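One way to see this law in action is a small simulation. The sketch below assumes a hypothetical event with true probability 0.5 (like a fair coin flip). The function name `simulated_proportion` and the fixed seed are illustrative choices.

```python
import random

def simulated_proportion(num_trials, p=0.5, seed=0):
    """Simulate num_trials trials and return the proportion of successes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    successes = 0
    for _ in range(num_trials):
        if rng.random() < p:
            successes += 1
    return successes / num_trials

# The empirical probability should tend toward p as the trial count grows.
for n in (10, 100, 10_000, 100_000):
    print(n, simulated_proportion(n))
```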

100

Based on these two values, we either reject or fail to reject the null hypothesis.

The significance level (alpha) and the p-value

100

What's the difference between a parameter and a statistic?

A parameter is a number that describes the population

A statistic is a number that describes the sample 

200

What is the difference between an observational study and an experiment?

Experiments impose treatments on subjects; observational studies do not

200

What is the difference between categorical and quantitative variables?

A categorical variable takes on values that are category names or labels. A quantitative variable takes on numerical values for a measured or counted quantity.

200

In this setting, there are two outcomes for each trial (success or failure), each trial is independent of the next, the number of trials is fixed, and the probability of success is the same for each trial.

Binomial Setting
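When all four conditions hold, the probability of exactly k successes in n trials can be computed with the binomial formula. This is a minimal sketch; the function name `binomial_pmf` and the example numbers are illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials, each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: exactly 2 heads in 4 fair coin flips.
print(binomial_pmf(2, 4, 0.5))
```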

200

What is the confidence interval formula for a one-sample t-interval for \mu?

\bar{x} \pm t^* \frac{s}{\sqrt{n}}
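A sketch of the interval calculation, assuming the critical value t* has already been looked up in a table for the right degrees of freedom (n - 1). The function name `one_sample_t_interval` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t_interval(data, t_star):
    """Return (lower, upper) for x-bar +/- t* * s / sqrt(n)."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_star * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample of size 5; t* = 2.776 is the table value for
# 95% confidence with 4 degrees of freedom.
print(one_sample_t_interval([1, 2, 3, 4, 5], t_star=2.776))
```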

200

What is the formula for the geometric probability P(X = x), where X is the number of trials until the first success?

P(X = x) = (1 - p)^{x - 1} p
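The geometric formula translates directly into code: x - 1 failures followed by one success. The function name `geometric_pmf` and the example values are illustrative.

```python
def geometric_pmf(x, p):
    """P(X = x): the first success occurs on trial x."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    return (1 - p) ** (x - 1) * p

# Hypothetical example: first success on the 3rd trial when p = 0.5.
print(geometric_pmf(3, 0.5))
```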

300

When can we make conclusions about cause and effect?

When researchers randomly assign subjects to treatment groups (in an experiment)

300

When is a data point considered an outlier? (Do not just say "when it is unusually larger or smaller than the other points" - be specific!)

A data point is considered an outlier when it is more than 1.5 * IQR above Q3 or below Q1, or when it is more than 2 standard deviations away from the mean.
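A sketch of the 1.5 * IQR rule only. It assumes quartiles are found as the medians of the lower and upper halves of the sorted data (leaving out the overall median when n is odd); other quartile conventions can give slightly different fences. The function name `iqr_outliers` and the data are hypothetical.

```python
from statistics import median

def iqr_outliers(data):
    """Return the values more than 1.5 * IQR below Q1 or above Q3."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    q1 = median(values[:half])           # median of the lower half
    q3 = median(values[half + n % 2:])   # median of the upper half
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical data with one unusually large value.
print(iqr_outliers([1, 2, 3, 4, 5, 6, 7, 50]))
```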

300

What is the formula for and interpretation of a z-score?

z = (value - mean) / (standard deviation)

A z-score tells us how many standard deviations a value is from the mean, and in which direction
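A quick sketch of the calculation, using made-up numbers; the function name `z_score` is illustrative.

```python
def z_score(value, mean, standard_deviation):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / standard_deviation

# Hypothetical example: a score of 85 when the mean is 70 and the SD is 10.
print(z_score(85, 70, 10))
```

A positive result means the value is above the mean; a negative result means it is below.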

300

What is the probability that a specific confidence interval captures the population parameter?

0 or 1! (Do NOT say the confidence level!)

300

How do we interpret the slope of a Least Squares Regression Line (LSRL)?

For every 1 unit increase in x (in context), the predicted y (in context) increases/decreases by the slope.

400

What is a randomized block design and what is its purpose?

For a randomized block design, treatments are assigned randomly within each block. For each block, individuals are similar to each other.

The purpose of blocking is to reduce variability of results within each treatment group and eliminate possible confounding variables

400

How do we describe the relationship between two variables (like in a scatterplot)?

Address the relationship's direction (positive or negative), unusual values (outliers, influential points), form (linear or nonlinear), and strength (weak, moderate, or strong)

400

For the random variables X and Y, what are the formulas for the mean and standard deviation of D = X - Y? Answer in terms of \mu_X, \mu_Y, \sigma_X, and \sigma_Y

\mu_D = \mu_X - \mu_Y

\sigma_D = \sqrt{\sigma_X^2 + \sigma_Y^2} (this requires X and Y to be independent)
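A small sketch of these two formulas, assuming X and Y are independent (the standard deviation formula does not hold otherwise). The function name `difference_mean_sd` and the numbers are hypothetical.

```python
from math import sqrt

def difference_mean_sd(mu_x, mu_y, sigma_x, sigma_y):
    """Mean and SD of D = X - Y, assuming X and Y are independent."""
    mu_d = mu_x - mu_y
    # Variances add even though we are subtracting the variables.
    sigma_d = sqrt(sigma_x**2 + sigma_y**2)
    return mu_d, sigma_d

# Hypothetical values: mu_X = 10, mu_Y = 4, sigma_X = 3, sigma_Y = 4.
print(difference_mean_sd(10, 4, 3, 4))
```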

400

What are the *specific* conditions for a two sample z-test for p1 - p2?

Random: Data come from independent random samples or 2 groups in a randomized experiment

10%: when sampling without replacement, n < 10% of the population size for both samples

Normality: n_1 \hat{p}_c >= 10, n_1 (1 - \hat{p}_c) >= 10, n_2 \hat{p}_c >= 10, and n_2 (1 - \hat{p}_c) >= 10 (use the large counts rule with the pooled p-hat value)
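The large counts check with the pooled proportion can be sketched as follows. This covers only the Normality condition; the Random and 10% conditions depend on how the data were collected. The function name `pooled_large_counts` and the counts are hypothetical.

```python
def pooled_large_counts(x1, n1, x2, n2, threshold=10):
    """Return (pooled p-hat, True if all four expected counts >= threshold)."""
    p_c = (x1 + x2) / (n1 + n2)  # pooled proportion of successes
    expected_counts = [
        n1 * p_c, n1 * (1 - p_c),
        n2 * p_c, n2 * (1 - p_c),
    ]
    return p_c, all(count >= threshold for count in expected_counts)

# Hypothetical samples: 30 successes out of 100 and 50 successes out of 100.
print(pooled_large_counts(30, 100, 50, 100))
```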

400

The Daily Double! Is Mr. K a good teacher?

 (True or False)

True 

500

What four elements should a well-designed experiment include?

Comparison of at least two treatment groups, random assignment of treatments to experimental units, replication, control of potential confounding variables

500

How do we describe a distribution (for one variable)?

Comment on shape (symmetry, modality, etc.), center (mean or median), spread (or variability - standard deviation, range, IQR), and unusual features (outliers, gaps, clusters).

500

Can two mutually exclusive events also be independent? Give an example to support your answer

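No, two mutually exclusive events (each with nonzero probability) cannot be independent. Example: For a single fair coin flip, let H = heads and T = tails. H and T are mutually exclusive, so P(H and T) = 0, but P(H) * P(T) = 0.25. Since these are not equal, H and T are not independent.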
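The coin flip example can be checked by listing the sample space and comparing P(H and T) with P(H) * P(T), which must be equal for independent events. This sketch assumes a fair coin with equally likely outcomes; the helper name `prob` is illustrative.

```python
from fractions import Fraction

sample_space = {"H", "T"}  # one flip of a fair coin

def prob(event):
    """Probability of an event (a set of outcomes), all equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

heads = {"H"}
tails = {"T"}

p_both = prob(heads & tails)         # mutually exclusive: no shared outcome
product = prob(heads) * prob(tails)  # what independence would require

print(p_both, product, p_both == product)
```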

500

What 2 - 3 questions can we ask to determine the correct inference procedure?

Does the scenario describe mean(s), proportion(s), counts, or slope?

Does the scenario describe one sample, two samples, or paired data?

Does the scenario describe a test or a confidence interval?

500

What is the difference between the population distribution, the sample distribution, and the sampling distribution?

The population distribution is the distribution of responses for every individual in the population.

The sample distribution is the distribution of responses for a single sample.

The sampling distribution is the distribution of values of a statistic for all possible samples of a given size from a given population.
