Organizing Data
Data Relationships
Definitions
Variables
Sampling
100
This measure of center is more resistant to outliers than the mean.
What is the median?
100
observed y - predicted y
What is the residual?
100

The difference between the third and first quartiles.

What is the IQR (Interquartile Range)?

100

Takes on values that are category names or group labels.

What is a categorical variable?

100

Introduces potential for bias because it does not use chance to select the individuals.

What is convenience sampling or voluntary response sampling?

200
To calculate, subtract the mean of the distribution from the observed x, then divide by the standard deviation.
What is the z-score (or standardized value)?
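The calculation in this clue can be sketched in a few lines of Python. The function name `z_score` and the numbers used (an observation of 85, a mean of 70, a standard deviation of 10) are made up for illustration.

```python
def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / std_dev

# Hypothetical values: an observation of 85 in a distribution
# with mean 70 and standard deviation 10.
z = z_score(85, 70, 10)
print(z)
```

A positive result means the observation lies above the mean; a negative result means it lies below.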
200
Measures the direction and strength of a linear relationship between two quantitative variables.
What is correlation (or r)?
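One common way to compute r is to average the products of the standardized x and y values, dividing by n - 1. The sketch below assumes that formula; the helper name `correlation` and the data are hypothetical.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sum the products of the z-scores of x and y, then divide by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    products = (
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    )
    return sum(products) / (n - 1)

# Hypothetical data lying exactly on an increasing line.
r_up = correlation([1, 2, 3, 4], [2, 4, 6, 8])
# Hypothetical data lying exactly on a decreasing line.
r_down = correlation([1, 2, 3, 4], [8, 6, 4, 2])
```

Points on an increasing line give r close to 1, and points on a decreasing line give r close to -1 (up to floating-point rounding).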
200
randInt(1,9,3)
What is the calculator command for generating 3 random numbers from 1 to 9?
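For comparison, a rough Python counterpart to this calculator command might look like the sketch below. The wrapper name `rand_int` is hypothetical; `random.randint` is the standard-library call and includes both endpoints.

```python
import random

def rand_int(low, high, count):
    """Return `count` random integers from low to high, inclusive.
    Repeats are allowed because each draw is independent."""
    return [random.randint(low, high) for _ in range(count)]

draws = rand_int(1, 9, 3)  # three integers, each from 1 to 9
print(draws)
```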
200

Takes on a countable number of values. The number of values may be finite or infinite.

What is a discrete variable?

200

The systematic tendency to overestimate or underestimate the true population parameter.

What is bias?

300
This rule helps determine whether data are approximately normally distributed by checking the percentage of observations within 1, 2, and 3 standard deviations of the mean.
What is the 68-95-99.7 rule?
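A minimal sketch of this check, assuming the data's mean and standard deviation are already known: compute the share of observations within k standard deviations and compare it with the share a normal distribution would give. The helper name `share_within` and the tiny data set are hypothetical.

```python
from statistics import NormalDist

def share_within(data, mean, std_dev, k):
    """Fraction of observations within k standard deviations of the mean."""
    inside = sum(1 for x in data if abs(x - mean) <= k * std_dev)
    return inside / len(data)

# Shares a normal distribution places within 1, 2, and 3
# standard deviations of its mean.
standard = NormalDist(0, 1)
expected = {k: standard.cdf(k) - standard.cdf(-k) for k in (1, 2, 3)}

# Hypothetical data, treated as having mean 0 and standard deviation 1.
observed = share_within([-1, 0, 1, 5], 0, 1, 1)
```

If the observed shares are far from roughly 68%, 95%, and 99.7%, the data are probably not well described by a normal distribution.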
300
The fraction of the variation in the values of y that is explained by the least-squares regression of y on x.
What is the coefficient of determination (or r squared)?
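To make this definition concrete, here is a sketch that fits a least-squares line and compares the leftover (residual) variation with the total variation in y. The function names and the three data points are hypothetical.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line of y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar

def r_squared(xs, ys):
    """1 minus (sum of squared residuals) / (total variation in y)."""
    slope, intercept = least_squares(xs, ys)
    y_bar = mean(ys)
    # Each residual is observed y minus predicted y.
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_residual / ss_total

# Hypothetical data.
r2 = r_squared([1, 2, 3], [1, 2, 2])
```

For these three points the line explains about 75% of the variation in y.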
300

In regression, a point that does not follow the general trend shown in the rest of the data and has a large residual.

What is a regression outlier?

300

One that takes on numerical values for a measured or counted quantity.

What is a quantitative variable?

300

A sample in which every group of a given size has an equal chance of being chosen. 

What is a simple random sample (SRS)?
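One way to draw such a sample in Python is `random.sample`, which selects without replacement so that every group of the requested size is equally likely. The population of 30 labeled students below is made up for illustration.

```python
import random

# Hypothetical population of 30 labeled individuals.
population = [f"student_{i}" for i in range(1, 31)]

# Draw 5 individuals without replacement.
srs = random.sample(population, 5)
print(srs)
```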

400
The square of the standard deviation.
What is the variance?
400
Applying a logarithmic transformation to both variables causes this type of model to become linear.
What is a power model?
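The reasoning: if y = a * x^p, then ln y = ln a + p * ln x, which is a straight line in (ln x, ln y). The sketch below checks this on made-up data generated from y = 3x^2; the variable names are hypothetical.

```python
from math import exp, log
from statistics import mean

# Hypothetical data generated from the power model y = 3 * x**2.
xs = [1, 2, 4, 8]
ys = [3 * x ** 2 for x in xs]

# Take logarithms of both variables.
log_x = [log(x) for x in xs]
log_y = [log(y) for y in ys]

# Fit a least-squares line to the transformed data.
x_bar, y_bar = mean(log_x), mean(log_y)
slope = (
    sum((u - x_bar) * (v - y_bar) for u, v in zip(log_x, log_y))
    / sum((u - x_bar) ** 2 for u in log_x)
)
intercept = y_bar - slope * x_bar

power = slope                 # estimates the exponent p
coefficient = exp(intercept)  # estimates the multiplier a
```

The slope of the transformed line recovers the exponent (about 2 here) and the exponentiated intercept recovers the multiplier (about 3).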
400

Any value that falls more than 1.5 × IQR above Q3 or below Q1.

What is the outlier rule?
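A sketch of this rule in Python follows. It assumes the quartile convention in which Q1 and Q3 are the medians of the lower and upper halves of the sorted data (the overall median is left out when the count is odd); other conventions can give slightly different fences. The function names and data are hypothetical.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]  # skips the middle value when the count is odd
    return median(lower), median(upper)

def find_outliers(data):
    """Return values more than 1.5 * IQR below Q1 or above Q3."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

# Hypothetical data with one unusually large value.
outliers = find_outliers([1, 2, 3, 4, 5, 6, 7, 30])
```

Here Q1 = 2.5 and Q3 = 6.5, so the IQR is 4 and the fences are -3.5 and 12.5; only 30 falls outside them.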

400

Takes on infinitely many values, but those values cannot be counted. 

What is a continuous variable?

400

Involves dividing a population into separate groups, called strata, based on shared attributes or characteristics (homogeneous grouping), then selecting a random sample from each stratum.

What is a stratified random sample?
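As a rough illustration, the sketch below groups a population into strata and then takes a simple random sample from each one. The function name `stratified_sample`, the `stratum_of` argument, and the junior/senior population are all hypothetical.

```python
import random

def stratified_sample(population, stratum_of, per_stratum):
    """Take a simple random sample of size per_stratum from every stratum."""
    strata = {}
    for individual in population:
        strata.setdefault(stratum_of(individual), []).append(individual)
    sample = []
    for members in strata.values():
        sample.extend(random.sample(members, per_stratum))
    return sample

# Hypothetical population: (name, grade level) pairs.
students = [(f"junior_{i}", "junior") for i in range(10)]
students += [(f"senior_{i}", "senior") for i in range(10)]

# Use grade level as the stratum and draw 3 students from each.
chosen = stratified_sample(students, lambda s: s[1], per_stratum=3)
```

This version draws the same number from each stratum; drawing in proportion to stratum size is another common choice.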

500
This calculator command finds the area under a normal curve over a given interval.
What is normalcdf?
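A Python sketch of the same idea, using `statistics.NormalDist` from the standard library: the area over an interval is the difference of two cumulative probabilities. The wrapper below borrows the calculator command's name for readability, but its argument order and defaults are assumptions of this example.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, std_dev=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mean, std_dev)
    return dist.cdf(upper) - dist.cdf(lower)

# Area within one standard deviation of the mean on the standard normal curve.
area = normalcdf(-1, 1)
print(round(area, 4))
```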
500

A point in regression that has a substantially larger or smaller x-value than the other observations have.

What is a high-leverage point?

500

In regression, any point that, if removed, changes the relationship substantially (creates big changes to the slope and/or y-intercept). Outliers and high-leverage points are often influential.

What is an influential point?

500

Not affected by other variables in a study; it is manipulated to see whether it affects other variables.

What is an independent variable?

500

Write each subject's name on equal-sized slips of paper. Put all the slips of paper in a hat. Mix well. Select as many names as needed for each treatment group, without replacement.

What is random assignment by drawing names from a hat?
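The hat procedure can be simulated in Python as a rough sketch: shuffling stands in for mixing the slips, and popping names off the list stands in for drawing without replacement. The function name `assign_groups` and the subject names are hypothetical.

```python
import random

def assign_groups(subjects, group_sizes):
    """Randomly split subjects into treatment groups of the given sizes."""
    if sum(group_sizes) > len(subjects):
        raise ValueError("not enough subjects for the requested groups")
    hat = list(subjects)   # one slip per subject
    random.shuffle(hat)    # mix well
    groups = []
    for size in group_sizes:
        # Draw `size` slips; a drawn slip never goes back in the hat.
        groups.append([hat.pop() for _ in range(size)])
    return groups

# Hypothetical subjects split into two treatment groups of three.
names = ["Ana", "Ben", "Cho", "Dev", "Eli", "Fay"]
treatment_a, treatment_b = assign_groups(names, [3, 3])
```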