This is what you get when you sum all values in a data set and divide by the number of values in that data set.
What is the mean?
This type of distribution looks the same on either side of the center, and the mean is approximately equal to the median.
What is symmetric?
This type of data visualization creates a point for each (x,y) pair of values.
What is a scatterplot?
This is the most basic sampling technique, where all possible individuals have the same chance of being selected.
What is simple random sampling?
This is the sum of the probabilities of all possible outcomes in a scenario.
What is 1? (or what is 100%?)
This value in a data set has half of the values above it and half of the values below it.
What is the median?
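A quick way to see how the mean and the median differ is to compute both on a small list. This is a minimal sketch using Python's standard `statistics` module; the numbers in `values` are made up for illustration.

```python
from statistics import mean, median

# Hypothetical data set, invented for this example.
values = [2, 3, 5, 7, 13]

# Mean: sum all values, then divide by how many there are.
data_mean = mean(values)

# Median: the middle value once the data are sorted.
data_median = median(values)

print("mean:", data_mean)
print("median:", data_median)
```

The high value 13 pulls the mean above the median, which is what happens in a distribution that is skewed right.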
This type of distribution has high outliers, leading to a tail on the right side of the distribution curve.
What is skewed right?
We use this word to describe how two variables are related. It may be strong or weak; it may be negative or positive.
What is correlation?
This is a type of study in which the researchers do not control any variables.
What is an observational study?
Every probability should be between these two numbers.
What are 0 and 1?
This diagram shows the counts or relative frequencies of the categorical values in a data set along its y-axis.
What is a bar chart?
In a normal distribution, about 95% of the data fall within this many standard deviations of the mean.
What is two?
This is a linear equation computed to represent the best fit between the two variables. It can be used to predict future data points.
What is the least squares regression line?
We implement controls in a study to minimize the effect of these.
What are confounding variables?
This is when two events have no overlap.
What is mutually exclusive?
This is a measure of spread based on the squared distances from the mean: it is the square root of their average. It is a common way to describe the variation in a set of values.
What is the standard deviation?
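The standard deviation can be worked out step by step from the squared distances to the mean. This is a sketch on made-up data; it shows both the population version (divide by n) and the sample version (divide by n - 1), since classes differ on which one they use by default.

```python
from math import sqrt

# Hypothetical data set, invented for this example.
values = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(values)

# Step 1: find the mean.
center = sum(values) / n

# Step 2: square each value's distance from the mean.
squared_distances = [(x - center) ** 2 for x in values]

# Step 3: average the squared distances and take the square root.
population_sd = sqrt(sum(squared_distances) / n)    # divide by n
sample_sd = sqrt(sum(squared_distances) / (n - 1))  # divide by n - 1

print("population SD:", population_sd)
print("sample SD:", round(sample_sd, 3))
```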
This descriptor tells you what percentage of the data is below a specific value.
What is a percentile?
This is the error (vertical distance) between a linear model's prediction and the actual data point.
What is the residual?
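The least squares regression line and its residuals can be computed by hand with the usual slope and intercept formulas. This is a minimal sketch; the points in `xs` and `ys` are made up, and a residual is taken here as the actual value minus the predicted value.

```python
# Hypothetical (x, y) pairs, invented for this example.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Slope = sum of (x - x_bar)(y - y_bar) divided by sum of (x - x_bar)^2.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
slope = s_xy / s_xx

# The line always passes through the point (x_bar, y_bar).
intercept = y_bar - slope * x_bar

# Residual = actual y minus the y predicted by the line.
predictions = [intercept + slope * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predictions)]

print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("residuals:", [round(r, 3) for r in residuals])
```

For a least squares line with an intercept, the residuals add up to zero (up to rounding), which is a handy check on the arithmetic.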
If it's not minimized through randomization, the sample may not share the same characteristics as the population.
What is bias?
This is the average outcome over many trials.
What is expected value?
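Expected value is each outcome multiplied by its probability, then summed. This sketch assumes a fair six-sided die (an example chosen here, not one from the clues) and compares the exact answer with the average of many simulated rolls.

```python
import random
from fractions import Fraction

# Assumed scenario: one roll of a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
probability = Fraction(1, 6)  # each face is equally likely

# The probabilities of all possible outcomes add up to 1.
total_probability = probability * len(outcomes)

# Expected value: sum of (outcome x probability).
expected_value = sum(x * probability for x in outcomes)

# Simulate many trials; the average should land close to the expected value.
random.seed(0)  # fixed seed so the run is repeatable
trials = 100_000
simulated_average = sum(random.choice(outcomes) for _ in range(trials)) / trials

print("expected value:", float(expected_value))
print("simulated average:", simulated_average)
```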
This diagram clearly displays the four quartiles of a distribution.
What is a box plot?
This descriptor tells you how many standard deviations above or below the mean a particular value is.
What is a z-score?
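A z-score is the distance from the mean divided by the standard deviation. This is a small sketch; the helper name `z_score` and the test-score numbers are hypothetical.

```python
def z_score(value, mean, standard_deviation):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / standard_deviation

# Hypothetical exam with mean 70 and standard deviation 10.
above = z_score(85, 70, 10)  # positive: above the mean
below = z_score(60, 70, 10)  # negative: below the mean

print(above, below)
```

A positive z-score means the value is above the mean, and a negative one means it is below.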
This is a mantra that should be one of your biggest takeaways from this class.
What is "Correlation ≠ Causation" ?
This experimental design ensures that neither the participants nor the researchers working with them know who got which treatment.
What is a double-blind study?
This is a diagram that shows the possible outcomes for successive events.
What is a tree diagram?