How do you find the median if there is an even number of data points?
You find the average of the two middle numbers
What is the formula for finding a Z-score?
(X - μ) / σ
What are the two types of variables in a scatterplot?
Explanatory and response
Response is dependent, explanatory is independent
An "individual" is...
An object described by a set of data
Name the 4 types of biases in sampling
- Response
-Non-response
-Wording effects
-Undercoverage
What is the median of this data set?
{12, 23, 34, 45, 54, 56, 67, 78, 89, 98}
55
(54 + 56) / 2
= 110 / 2
= 55
In a normal distribution, what is the probability that a randomly selected data point falls within one standard deviation of the mean?
68%
(68, 95, 99.7% rule)
How do you find R2?
r2
Correlation coefficient squared
What is the difference between categorical and quantitative variables?
A categorical variable places an individual into a group (categorizes)
A quantitative variable uses numerical values to charaterize an individual
Observational study vs. Experiment
Observational Study: Observes individuals, does not influence the response.
Experiment: Deliberately imposes treatment(s) to see responses of individuals.
Given this 5 Number Summary, find the IQR
Min = 5
Q1 = 14
Median = 32
Q3 = 44
Max = 53
IQR = 30
What can you find with a z-score
(Hint: Which side is it?)
Use Table A to get the proportion of data to the left of x
What two numbers is the correlation always going to be between?
Bonus: What do both extremes mean?
-1 and 1
-1 means it is a perfect negative correlation
1 means it is a perfect positive correlation
A researcher wants to use education level to explain differences in income.
What is the explanatory variable? What is the response variable?
Explanatory: Education level
Response: Income
Name the types of bad sampling designs.
(that we talked about)
- Convenience (the easiest to reach)
- Voluntary response (opt-in surveys & studies)
The median is a better measure of center than the mean when the data is
Skewed/has many outliers
What are the attributes of a Normal distribution?
Single peaked, symmetric, area underneath =1
What is the equation for the regression line? And what are the equations for its components?
ŷ = mx + b
m = r (sy/sx)
b = ȳ − m.
When is the distinction between explanatory and response variables essential?
When creating a least-squares regression line
Matched Pairs vs. Block design
Block design is matched pairs, but in larger groups.
Matched Pairs: Pairing individuals that are similar to observe the effects of different treatments on similar subjects.
Block Design: Group of individuals that are known before the experiment to be similar in some way that is expected to affect the response to the treatments.
What is the rule (in this class) for determining if a data point is an outlier?
1.5 * IQR rule.
Q3 + 1.5 * IQR < x, x is an outlier
Q1 - 1.5 *IQR > x, x is an outlier
What is the official name of the number you get out of Table A?
How do you find a residual, and what does it tell you?
Tells us how "off" our prediction was.
What does "confounded" mean?
When the effects of an explanatory and lurking variable on a response variable are indistinguishable from each other.
Notation for the sample standard deviation and mean.
s and 