This number is the middle value of any ordered set of data
What is the median
100
All the area under a density curve equals this
What is 1
100
Definitions of an explanatory variable and a response variable
What are explanatory: a variable that helps explain or influences changes in the response variable
response: a variable that measures the outcome of a study
100
Two way tables are used to compute this
What are conditional distributions
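A minimal sketch of how a conditional distribution comes out of a two-way table: divide each count in a row by that row's total. The table, its category names, and its counts are made up for illustration.

```python
# Hypothetical two-way table (counts are invented for illustration).
# Rows: class year (explanatory). Columns: preferred study method (response).
table = {
    "junior": {"flashcards": 30, "practice tests": 20},
    "senior": {"flashcards": 10, "practice tests": 40},
}

def conditional_distribution(row):
    """Return each count in the row as a proportion of the row total."""
    total = sum(row.values())
    return {category: count / total for category, count in row.items()}

for year, row in table.items():
    print(year, conditional_distribution(row))
```

Comparing the rows (0.6 vs. 0.2 for flashcards here) is what reveals an association between the two variables.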
100
Bias
What is the systematic tendency of a sampling method to favor certain outcomes
200
A box plot shows these five major numbers of a data set
What are Minimum, Q1, median, Q3, and maximum
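A short sketch of computing the five-number summary. It assumes the "median of each half" convention for quartiles (the median itself is left out of both halves when the count is odd); calculators and software may use slightly different quartile rules.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for at least two values."""
    values = sorted(data)
    n = len(values)
    lower = values[: n // 2]        # observations below the median position
    upper = values[(n + 1) // 2:]   # observations above the median position
    return (values[0], median(lower), median(values), median(upper), values[-1])

print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```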
200
These are two symbols that stand for mean and standard deviation when dealing with density curves
What are μ and σ
200
The symbol that stands for the correlation of a scatter plot and the range of values it can take
What are r and -1 to 1
200
This special kind of bar graph can be used to express conditional distributions
What is a segmented bar graph
200
These two methods of sampling are biased by their definition
What are voluntary response sampling and convenience sampling
300
Histograms are made from this data table
What is a frequency table
300
Definition of a percentile
What is the value with p percent of the observations less than or equal to it (the pth percentile)
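To make the definition concrete, here is a small sketch that finds the percentile of a value as the percent of observations less than or equal to it. The scores are invented for illustration.

```python
def percentile_of(value, data):
    """Percent of observations in data that are less than or equal to value."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

# Hypothetical test scores.
scores = [55, 62, 70, 70, 78, 81, 85, 90, 94, 99]
print(percentile_of(85, scores))  # 7 of the 10 scores are <= 85
```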
300
The formula for the residual of a value
What is y - ŷ (observed y minus predicted y)
300
Two variables are confounded when this happens
What is the situation in which the variables' effects on a response variable cannot be distinguished from each other
300
It is possible to give every member of the population an equal chance in a simple random sample using any of these three tools
What are a hat, a table of random digits, and a calculator's randInt function
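The same idea as drawing names from a hat can be sketched in software. This example assumes a made-up population of 30 labeled students and uses Python's `random.sample`, which draws without replacement.

```python
import random

# Hypothetical population: 30 labeled individuals.
population = [f"student_{i}" for i in range(1, 31)]

# A seeded generator keeps the draw reproducible; the seed is arbitrary.
rng = random.Random(42)

# Draw 5 distinct individuals; every group of 5 is equally likely.
sample = rng.sample(population, k=5)
print(sample)
```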
400
This is the rule used to determine whether a point on a box plot is an outlier
What is any value below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR), where IQR = Q3 - Q1
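A brief sketch of the 1.5 × IQR rule: a value is flagged as an outlier when it falls below Q1 - 1.5·IQR or above Q3 + 1.5·IQR. The quartiles in the example are hypothetical.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def is_outlier(x, q1, q3):
    """True when x lies outside the fences."""
    low, high = outlier_fences(q1, q3)
    return x < low or x > high

# Hypothetical quartiles: Q1 = 10, Q3 = 20, so IQR = 10.
print(outlier_fences(10, 20))   # (-5.0, 35.0)
print(is_outlier(36, 10, 20))   # True
```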
400
The z-score formula
What is z = (x - μ)/σ
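A quick sketch of standardizing a value: subtract the mean first, then divide by the standard deviation. The numbers in the example are made up.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) mu."""
    return (x - mu) / sigma

# Hypothetical distribution with mean 100 and standard deviation 15.
print(z_score(130, 100, 15))  # 2.0
```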
400
The Least Squares Regression Line formula and its meaning
What is ŷ = a + bx, the line that best predicts the y values of a scatter plot from the x values by making the sum of the squared residuals as small as possible
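A minimal sketch of fitting the least-squares line ŷ = a + bx by hand and then computing each residual as observed y minus predicted ŷ. The data points are invented for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print(a, b)
print(residuals)
```

The residuals of a least-squares line always sum to zero (up to rounding), which is a handy check on the arithmetic.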
400
A residual plot has to have this kind of scatter to show that a linear model is best to represent a scatter plot's data
What is a random scatter of points
400
Both of these are examples of how a sample could be biased based on its choice of individuals
What are undercoverage and nonresponse
500
SOCS stands for these four things you must identify on any graph of a data set
What are shape, center, spread, and outliers
500
The Empirical Rule of a normal distribution
What is about 68% of observations fall within μ ± σ
about 95% fall within μ ± 2σ
about 99.7% fall within μ ± 3σ
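The 68-95-99.7 figures are rounded. A short check with Python's `statistics.NormalDist` shows the proportion of a normal distribution within 1, 2, and 3 standard deviations of the mean.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
```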
500
r^2 is called this and shows this
What is the coefficient of determination, which shows the fraction of the variation in the y values that is explained by the least-squares regression of y on x
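A small sketch of computing r and then squaring it to get the coefficient of determination. The data are made up, and the correlation is written out by hand rather than taken from a library.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
r_squared = r ** 2
print(round(r, 4), round(r_squared, 4))
```

For these points r^2 comes out to about 0.6, so roughly 60% of the variation in y is accounted for by the linear relationship with x.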
500
This is a type of variable that is not included as an explanatory or response variable in the analysis but can affect the interpretation of relationships between variables
What is a lurking variable
500
Explain how to conduct a stratified random sample
What is first dividing the population into groups of individuals that share a similar characteristic, called strata, then selecting a simple random sample from each stratum and combining those samples to form the final sample
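The procedure in the answer above (split into strata, take a simple random sample within each, then combine) can be sketched as follows. The strata, member labels, and sample size per stratum are hypothetical.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Draw a simple random sample of per_stratum members from each stratum
    and combine the draws into one sample."""
    rng = random.Random(seed)
    combined = []
    for members in strata.values():
        combined.extend(rng.sample(members, k=per_stratum))
    return combined

# Hypothetical population already divided into strata by grade level.
strata = {
    "freshmen": [f"fr_{i}" for i in range(1, 21)],
    "sophomores": [f"so_{i}" for i in range(1, 21)],
    "juniors": [f"jr_{i}" for i in range(1, 21)],
}

chosen = stratified_sample(strata, per_stratum=3, seed=1)
print(chosen)
```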