Takes values that are category names or labels.
(What type of variable)
Categorical variable
What is a mean?
The average of a group of numbers: the sum of the values divided by how many values there are.
What is the Z score formula?
(X - X bar) / standard deviation
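A minimal Python sketch of the z-score calculation, assuming the sample standard deviation is the one intended. The function name `z_score` and the data are made up for illustration.

```python
from statistics import mean, stdev

def z_score(x, data):
    """Return how many sample standard deviations x lies from the mean of data."""
    return (x - mean(data)) / stdev(data)

scores = [70, 75, 80, 85, 90]  # hypothetical test scores
print(z_score(90, scores))     # positive: 90 is above the mean
print(z_score(70, scores))     # negative: 70 is below the mean
```

The value comes first in the subtraction, so values above the mean get positive z-scores and values below it get negative ones.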
How can we represent categorical data?
With a frequency table or relative frequency table.
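As a rough illustration, a frequency table and a relative frequency table could be built in Python like this. The list `colors` is invented sample data.

```python
from collections import Counter

colors = ["red", "blue", "red", "green", "blue", "red"]  # hypothetical responses

# Frequency table: how many times each category appears.
freq = Counter(colors)

# Relative frequency table: each count divided by the total number of values.
n = len(colors)
rel_freq = {category: count / n for category, count in freq.items()}

for category in freq:
    print(category, freq[category], round(rel_freq[category], 3))
```

The relative frequencies always add up to 1 (or 100%).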
Takes numerical values for a measured or counted quantity.
(What type of variable)
Quantitative variable
What summary statistics can be used to describe the center and position of a distribution of quantitative data?
Center: mean, median
Position: Q1 and Q3
What is a percentile?
The percentile of a value is the percent of data values less than or equal to that value.
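A short Python sketch of that definition (counting values less than or equal to the given value). The function name `percentile_of` and the data are assumptions for illustration; some textbooks count only values strictly less than the given value.

```python
def percentile_of(value, data):
    """Percent of data values less than or equal to value."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # hypothetical scores
print(percentile_of(80, scores))  # 6 of the 10 values are <= 80
```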
How can we represent categorical data graphically?
Bar chart or pie chart
Type of quantitative variable that takes countable values with gaps between them.
Discrete variable
What summary statistics can be used to describe the variability of a distribution of quantitative data?
Variability: range, IQR, and standard deviation
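A sketch of the three measures of variability in Python. It assumes the default quartile method of `statistics.quantiles`, which may give slightly different quartiles than the method in a given textbook; the data are made up.

```python
from statistics import quantiles, stdev

data = [2, 4, 4, 5, 7, 9, 10, 12]  # hypothetical data set

# Range: distance between the largest and smallest values.
data_range = max(data) - min(data)

# IQR: distance between the third and first quartiles.
q1, med, q3 = quantiles(data, n=4)
iqr = q3 - q1

# Sample standard deviation: typical distance of values from the mean.
sd = stdev(data)

print(data_range, iqr, round(sd, 2))
```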
What does a Z score tell us?
The number of standard deviations a value lies above or below the mean.
What are the important characteristics to discuss when describing the distribution of quantitative data?
Shape, center, variability
Type of quantitative variable that is not countable and has no gaps between possible values.
Continuous variable
What is the five-number summary and how do we use it to make a boxplot?
Minimum, Q1, median, Q3, maximum
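The five-number summary splits the data into four quarters: draw a box from Q1 to Q3 with a line at the median, then extend whiskers out to the minimum and maximum.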
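One way to compute a five-number summary in Python, assuming the common textbook convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data (other conventions exist). The function name and data are hypothetical.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) using medians of the halves."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]              # values below the middle
    upper = xs[half + (n % 2):]    # skip the middle value when n is odd
    return min(xs), median(lower), median(xs), median(upper), max(xs)

data = [2, 4, 4, 5, 7, 9, 10, 12]  # hypothetical data set
print(five_number_summary(data))
```

These five values are the positions of the left whisker end, the left edge of the box, the line inside the box, the right edge of the box, and the right whisker end.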
What is the empirical rule?
About 68% of the data is within 1 SD of the mean.
About 95% of the data is within 2 SD of the mean.
About 99.7% of the data is within 3 SD of the mean.
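The three percentages can be checked against a normal model in Python. This sketch assumes the data follow a normal distribution, which is the setting where the empirical rule applies.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Share of a normal distribution within k standard deviations of the mean.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in within.items():
    print(f"within {k} SD: {share:.1%}")
```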
How can we determine if a value in a data set is an outlier?
A value more than 1.5 × IQR below Q1 or more than 1.5 × IQR above Q3.
A value 2 or more standard deviations away from the mean.
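Both outlier rules can be sketched in Python as follows. The quartiles here are taken as medians of the lower and upper halves, which is an assumption; the function names and data are made up.

```python
from statistics import mean, median, stdev

def iqr_outliers(data):
    """Values more than 1.5 * IQR below Q1 or above Q3."""
    xs = sorted(data)
    half = len(xs) // 2
    q1 = median(xs[:half])
    q3 = median(xs[half + len(xs) % 2:])
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in xs if x < low_fence or x > high_fence]

def sd_outliers(data):
    """Values 2 or more sample standard deviations from the mean."""
    m, s = mean(data), stdev(data)
    return [x for x in data if abs(x - m) >= 2 * s]

data = [2, 4, 4, 5, 7, 9, 10, 30]  # hypothetical data with one large value
print(iqr_outliers(data))
print(sd_outliers(data))
```

The two rules do not always agree, so it helps to say which one was used.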
What is one way to decide, from a graphical representation, whether there is a relationship between two categorical variables?
Side-by-side or segmented bar graph
How does the shape of the distribution influence the relationship between the mean and the median?
Skewed right distribution, mean > median
Skewed left distribution, mean < median
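A small Python check of how skew pulls the mean away from the median. Both data sets are invented for illustration.

```python
from statistics import mean, median

right_skewed = [1, 2, 2, 3, 3, 4, 20]       # long tail of large values
left_skewed = [1, 17, 18, 18, 19, 19, 20]   # long tail of small values

print(mean(right_skewed), median(right_skewed))  # mean is pulled above the median
print(mean(left_skewed), median(left_skewed))    # mean is pulled below the median
```

In each case the mean moves toward the tail, while the median stays near the bulk of the data.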
How can we use the z-scores to find the percent of data values left, right, and between?
Left: get area from Table
Right: 1 - area from Table
Between: subtract the smaller area from the larger area, both from Table
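The same three calculations can be sketched in Python, with `NormalDist().cdf` standing in for the table lookup (it returns the area to the left of a z-score under the standard normal curve). The z-scores 1.0 and -0.5 are arbitrary examples.

```python
from statistics import NormalDist

table = NormalDist()  # standard normal: mean 0, standard deviation 1

left = table.cdf(1.0)                       # area to the left of z = 1.0
right = 1 - table.cdf(1.0)                  # area to the right of z = 1.0
between = table.cdf(1.0) - table.cdf(-0.5)  # area between z = -0.5 and z = 1.0

print(round(left, 4), round(right, 4), round(between, 4))
```

Multiply each area by 100 to express it as a percent of data values.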
Which summary statistics are resistant, and which are nonresistant?
Resistant: median, IQR
Nonresistant: mean, standard deviation, range
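A quick Python illustration of resistance, using made-up data: replacing the largest value with an extreme one leaves the median unchanged but shifts the mean, standard deviation, and range.

```python
from statistics import mean, median, stdev

data = [10, 12, 13, 15, 16]        # hypothetical data set
with_outlier = data[:-1] + [160]   # same data, largest value replaced by 160

for name, xs in (("original", data), ("with outlier", with_outlier)):
    print(name, median(xs), mean(xs), round(stdev(xs), 2), max(xs) - min(xs))
```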