The objects described by a set of data. May be people, animals or things.
What are Individuals?
Sketch a number line that covers all values that occur.
Put a dot above the number line for each observation, stacking dots for repeated values.
Use for data sets that have integer values, or values that can be easily rounded.
Use for data sets of reasonable size: 40 observations or fewer.
What is a dot plot?
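As a rough illustration of the dot plot steps above, here is a minimal text-only sketch. The function name `dot_plot` and the sibling counts are made up for the example, and the number line runs down the page instead of across it to keep the code short.

```python
from collections import Counter

def dot_plot(values):
    """Return a text dot plot: one row per number-line value, one dot per observation."""
    counts = Counter(values)
    lines = []
    # Walk the whole number line so that gaps in the data stay visible.
    for v in range(min(values), max(values) + 1):
        lines.append(f"{v:>3} | " + "." * counts[v])
    return "\n".join(lines)

# Hypothetical data: number of siblings reported by 10 students.
siblings = [0, 1, 1, 2, 2, 2, 3, 3, 5, 1]
print(dot_plot(siblings))
```

This version only handles integer data, which matches the advice on the card.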
A way to organize data on two categorical variables that may have some relationship or association.
What is a two-way table?
The sum of all values divided by the total number of observations.
What is the MEAN?
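The mean card can be checked with a few lines of code. This is only a sketch; the function name `mean` and the scores are invented for the example.

```python
def mean(values):
    """Sum of all values divided by the number of observations."""
    return sum(values) / len(values)

# Hypothetical test scores.
scores = [70, 80, 90, 100]
print(mean(scores))  # 85.0
```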
A graphical display built with the 5-number summary that gives representations of center and spread.
What is a BOXPLOT?
Any characteristic of an individual. Can take different values for different individuals.
What is a variable?
1. Divide the data into classes of equal width.
2. Find the count (frequency) or percent (relative frequency) for each class.
3. Label and scale the axes and draw your bars.
What is a Histogram?
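The three histogram steps above could be sketched as follows. The function name `histogram_table`, its parameters, and the quiz scores are assumptions made for this example, and the bars are drawn as rows of `#` characters instead of on real axes.

```python
def histogram_table(values, start, width, n_classes):
    """Return (low, high, count, percent) for each equal-width class [low, high)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)  # which class this value falls in
        if 0 <= index < n_classes:
            counts[index] += 1
    total = len(values)  # percents are out of all observations supplied
    rows = []
    for i, count in enumerate(counts):
        low = start + i * width
        rows.append((low, low + width, count, 100 * count / total))
    return rows

# Hypothetical quiz scores.
scores = [52, 58, 61, 64, 67, 70, 73, 75, 78, 88]
for low, high, count, percent in histogram_table(scores, start=50, width=10, n_classes=4):
    print(f"{low}-{high}: {'#' * count} ({count}, {percent:.0f}%)")
```

A value that sits exactly on a class boundary, such as 70, is counted in the higher class here. That is one common convention, not the only one.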
The totals in a two-way table that fall in the margins: the bottom row and the rightmost column.
What is a MARGINAL DISTRIBUTION?
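To see where the marginal totals come from, here is a small sketch. The table contents and the function name `marginal_totals` are hypothetical, and the two-way table is stored as a nested dictionary for convenience.

```python
# Hypothetical two-way table: rows are grade levels, columns are preferred pets.
table = {
    "Freshman": {"Dog": 12, "Cat": 8},
    "Senior": {"Dog": 9, "Cat": 11},
}

def marginal_totals(table):
    """Return (row totals, column totals) for a nested-dict two-way table."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    column_totals = {}
    for cells in table.values():
        for column, count in cells.items():
            column_totals[column] = column_totals.get(column, 0) + count
    return row_totals, column_totals

row_totals, column_totals = marginal_totals(table)
print(row_totals)     # {'Freshman': 20, 'Senior': 20}
print(column_totals)  # {'Dog': 21, 'Cat': 19}
```

The row totals would be written in the rightmost column of the table and the column totals in the bottom row.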
Order the observations from least to greatest. Find the middle term if there is an odd number of observations. For an even number of observations, take the mean of the two observations in the center.
How do you find the MEDIAN?
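The median procedure above translates almost line for line into code. This is a minimal sketch with a made-up function name and made-up data.

```python
def median(values):
    """Middle value of the ordered data, or the mean of the two middle values."""
    ordered = sorted(values)  # order from least to greatest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]  # odd count: the single middle term
    return (ordered[middle - 1] + ordered[middle]) / 2  # even count: mean of the center pair

print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```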
Q3 - Q1; a measure of the spread of the middle 50% of the data set
What is IQR?
Places individuals into one of several groups or categories. If numerical, it does not make sense to find an average.
What is a categorical variable?
For reasonable size data sets:
1. Order your data set from lowest to highest.
2. Separate each observation into a STEM (all but the final digit) and a LEAF (the final digit).
3. Write the stems in a vertical column, smallest to largest, and draw a vertical line to the right.
4. Record each leaf in the row to the right of its stem, in order from least to greatest.
What is a STEM PLOT?
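The stem plot steps above can be sketched in code. The function name `stem_plot` and the ages are invented for the example, and the sketch assumes non-negative whole numbers so that the final digit can be split off with `divmod`.

```python
from collections import defaultdict

def stem_plot(values):
    """Return a text stem plot for non-negative integers."""
    leaves = defaultdict(list)
    for v in sorted(values):         # sorting first keeps each row's leaves in order
        stem, leaf = divmod(v, 10)   # stem: all but the final digit; leaf: the final digit
        leaves[stem].append(leaf)
    lines = []
    # List every stem from smallest to largest, including stems with no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {row}".rstrip())
    return "\n".join(lines)

# Hypothetical ages.
ages = [12, 15, 21, 34, 38, 30]
print(stem_plot(ages))
```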
Describes the values of a variable among individuals who have a specific value of another variable. We restrict our study to just one row or column of data and a particular outcome in that row or column.
What is a conditional distribution?
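As an illustration of restricting attention to one row of a two-way table, here is a short sketch. The table, the nested-dictionary layout, and the function name `conditional_distribution` are all hypothetical.

```python
# Hypothetical two-way table: rows are grade levels, columns are preferred pets.
table = {
    "Freshman": {"Dog": 12, "Cat": 8},
    "Senior": {"Dog": 9, "Cat": 11},
}

def conditional_distribution(table, row):
    """Proportion of each column category among individuals in the given row only."""
    cells = table[row]
    total = sum(cells.values())  # the row total, not the grand total
    return {column: count / total for column, count in cells.items()}

print(conditional_distribution(table, "Freshman"))  # {'Dog': 0.6, 'Cat': 0.4}
```

Dividing by the row total, not the grand total, is what makes the result conditional on that row.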
The observation that occurs the most frequently.
What is the MODE?
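A small sketch of finding the mode, using a made-up function name and data. It returns a list because a data set can have more than one most frequent value.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

print(modes([2, 3, 3, 5, 5, 7]))  # [3, 5]
print(modes([1, 4, 4, 6]))        # [4]
```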
A measure of the typical distance of the observations in the data set from the mean.
What is Standard Deviation?
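The card does not give a formula, so the sketch below assumes the usual sample standard deviation, which divides the sum of squared deviations by n - 1. The function name and data are made up for the example.

```python
import math

def sample_standard_deviation(values):
    """Sample standard deviation (assumed form: divide by n - 1, then take the square root)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (n - 1))

print(sample_standard_deviation([2, 4, 6]))  # 2.0
```

Here the mean is 4, the squared deviations are 4, 0, and 4, and 8 divided by 2 gives 4, whose square root is 2.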
Takes on numerical values for which it makes sense to find an average.
What is a quantitative variable?
Don't forget your SOCS!
Shape
Outliers
Center
Spread
How do we describe one-variable distributions?
Two variables are associated if knowing the value of one helps predict the value of the other.
What is an ASSOCIATION?
BONUS!!!!
Minimum, Q1, Median, Q3, Maximum
What is the 5-number summary?
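Here is a minimal sketch of computing the five numbers. It assumes the common classroom convention that Q1 and Q3 are the medians of the lower and upper halves of the ordered data, leaving out the overall median when the count is odd. Calculators and software may use slightly different quartile rules, and the function names are invented for the example.

```python
def median(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]        # observations below the median position
    upper = ordered[(n + 1) // 2 :]  # observations above the median position
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

# Hypothetical data.
data = [1, 3, 4, 6, 7, 9, 12]
minimum, q1, med, q3, maximum = five_number_summary(data)
print((minimum, q1, med, q3, maximum))  # (1, 3, 6, 9, 12)
print(q3 - q1)                          # IQR: 6
```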
When the data set is skewed left or right, so that the median is a better measure of center than the mean.
When should we use IQR as a measure of spread?
It tells us what values the variable takes on and how often it takes on those values.
What is a distribution?
Frequency is a COUNT of the observations that fall in some range or category.
Relative frequency is that count expressed as a PERCENT of all observations.
What is the difference between FREQUENCY and RELATIVE FREQUENCY?
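The difference shows up clearly when both are computed side by side. This is a sketch with a made-up function name and made-up survey responses.

```python
from collections import Counter

def frequency_table(responses):
    """Map each category to (frequency, relative frequency as a percent)."""
    counts = Counter(responses)
    total = len(responses)
    return {category: (count, 100 * count / total) for category, count in counts.items()}

# Hypothetical favorite-color responses.
colors = ["red", "blue", "red", "green", "red", "blue", "red", "blue"]
for category, (count, percent) in frequency_table(colors).items():
    print(f"{category}: frequency {count}, relative frequency {percent}%")
```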
Pie charts, bar charts (including side-by-side and segmented), and mosaic plots.
How do we show distributions of categorical variables graphically?
A better measure of center than the mean for a very skewed data set. It is more resistant to outliers.
What is the Median?
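A quick way to see this resistance is to add one extreme value to a small data set and compare how far each measure of center moves. The numbers are made up, and the sketch uses `mean` and `median` from Python's standard `statistics` module.

```python
from statistics import mean, median

typical = [2, 3, 4, 5, 6]
with_outlier = typical + [60]  # one hypothetical extreme value

mean_shift = mean(with_outlier) - mean(typical)
median_shift = median(with_outlier) - median(typical)
print(f"mean moved by {mean_shift:.2f}, median moved by {median_shift:.2f}")
# mean moved by 9.33, median moved by 0.50
```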
BONUS!!
There are few in the skew: the longer tail, where the observations are few, points in the direction of the skew.
How do we know whether a distribution is LEFT skewed or RIGHT skewed?