Which data displays are used for QUALITATIVE data?
Pie charts and bar graphs.
How do you display quantitative data?
Stem and leaf
Dot plots
Box plots
What are the common measures of central tendency?
Mean, median, and mode.
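A minimal sketch of the three measures using Python's standard `statistics` module. The data set `scores` is made up for illustration only.

```python
from statistics import mean, median, multimode

# Hypothetical data set, chosen only for illustration.
scores = [2, 3, 3, 5, 7, 10]

print(mean(scores))       # arithmetic average: sum of values / number of values
print(median(scores))     # middle value (average of the two middle values here)
print(multimode(scores))  # list of the most frequent value(s)
```

`multimode` returns a list, so a bimodal data set gives two values instead of one.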
What is a histogram, and how is it used for data display?
A graph used to visualize frequency distributions.
What do you need to know/remember?
You need to remember how to identify and describe the shape, outliers, center, and spread of a distribution, and how to calculate the measures of center and spread.
How do you make a bar graph?
Draw and label the axes
Horizontal axis - name of category
Vertical axis - the frequency (count) or relative frequency (percent or proportion)
Scale the axes
EQUALLY spaced intervals for the horizontal axis
On the vertical axis, start at 0 and place EQUALLY spaced tick marks until you exceed the largest frequency or relative frequency in any category
Draw
Draw bars above the category names. Make the bars equal in width and leave gaps between them
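The counting behind a bar graph can be sketched as follows. The `responses` list is hypothetical, and the bars are printed sideways as text rather than drawn on real axes.

```python
from collections import Counter

# Hypothetical survey responses (one categorical variable).
responses = ["dog", "cat", "dog", "fish", "dog", "cat"]

counts = Counter(responses)                            # frequency of each category
total = sum(counts.values())
relative = {c: n / total for c, n in counts.items()}   # proportion of each category

for category in sorted(counts):
    # One bar per category; the bar length is the frequency.
    print(f"{category:<5} {'#' * counts[category]} ({relative[category]:.0%})")
```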
What is the 5-number summary?
The 5-number summary describes the distribution of QUANTITATIVE data.
It consists of:
Minimum
Quartile 1 (Q1)
Median (MD; this is Quartile 2, but that name is not used)
Quartile 3 (Q3)
Maximum
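A rough sketch of computing the five values. It assumes one common quartile convention: Q1 and Q3 are the medians of the lower and upper halves, leaving out the overall median when the count is odd. Other conventions (and some calculators) give slightly different quartiles. The function name `five_number_summary` is made up for this example.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), with quartiles as medians of the halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]             # values below the median position
    upper = data[half + n % 2:]     # values above it (skips the median when n is odd)
    return data[0], median(lower), median(data), median(upper), data[-1]

# Hypothetical data for illustration.
print(five_number_summary([1, 3, 4, 6, 7, 9, 12]))
```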
What are the measures of center?
Mean
Median
Mode
Grouped mean
Weighted mean
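The weighted mean and grouped mean use the same arithmetic, which can be sketched as follows. The function names and the sample numbers are hypothetical. The grouped mean here is the usual approximation that treats every value in a class as sitting at the class midpoint.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def grouped_mean(midpoints, frequencies):
    """Approximate mean of grouped data: class midpoints weighted by frequency."""
    return weighted_mean(midpoints, frequencies)

# Hypothetical grades (4 = A, 3 = B, 2 = C) weighted by credit hours.
print(weighted_mean([4, 3, 2], [3, 4, 3]))
# Hypothetical classes with midpoints 102, 107, 112 and frequencies 2, 5, 3.
print(grouped_mean([102, 107, 112], [2, 5, 3]))
```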
What do histograms do?
Show each interval as a range of data values
The heights of the bars show the frequencies of values in each interval.
Bars MUST touch
- Refresher: parentheses ( ) mean the endpoint is excluded; brackets [ ] mean the endpoint is included
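As a small illustration of the notation, the interval [10, 20) includes 10 but excludes 20. The helper `in_interval` is a made-up name for this sketch.

```python
def in_interval(x, low, high):
    """True when x lies in [low, high): low is included, high is excluded."""
    return low <= x < high

print(in_interval(10, 10, 20))  # left endpoint, included
print(in_interval(20, 10, 20))  # right endpoint, excluded
```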
- Quantitative Data
Shape
What does it look like?
Identify mode
Unimodal, bimodal, no mode
Describe shape
Symmetrical, approximately normal (bell curve), unimodal, bimodal, skewed
Uniform if mostly flat: no clear mode, bars of about the same height
Trace bars to show the shape
Shape has a DIRECT relationship with center
How can you use side-by-side bar graphs?
Can be used to compare categorical variables in two or more groups.
Bars are grouped by values of one categorical variable.
How do you make a dot plot?
Draw a horizontal axis (a number line) and label it with the variable name
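Scale the axis from the minimum to the maximum value. Make sure the intervals are evenly spaced
Place a dot above the axis at the location of each data value, stacking dots for repeated values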
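A text-only sketch of a dot plot, assuming whole-number data. To keep it printable, the axis runs down the page instead of across, so each row is one axis value and the dots stack to the right. The function name `dot_plot` is hypothetical.

```python
from collections import Counter

def dot_plot(values):
    """Return one text row per axis value, from the minimum to the maximum."""
    counts = Counter(values)
    rows = []
    for x in range(min(values), max(values) + 1):   # evenly spaced integer axis
        rows.append(f"{x:>3} | " + "." * counts[x])
    return rows

# Hypothetical data for illustration.
for row in dot_plot([1, 2, 2, 4, 4, 4]):
    print(row)
```

Values with no observations (3 in this data) still get a row, which keeps the axis evenly spaced.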
What are the properties of the mean and median?
Mean
Uses all data values.
Varies less than the median or mode when samples are taken from the same population
Used in computing other statistics, such as the variance
Unique, and not necessarily one of the data values
Cannot be used with open-ended classes
Affected by extremely high or low values, called outliers
Median
Gives the midpoint (exact middle of the data)
Used when it is necessary to find out whether the data values fall into the upper half or lower half of the distribution.
Can be used for an open-ended distribution.
Affected less than the mean by extremely high or extremely low values.
How do you make a histogram?
Choose equal-width intervals that span the data. Five intervals are a good minimum if the number of intervals is not given
Make a table to show the frequency of data values in each interval. (Frequency Chart)
Draw and label the axes. Put the name of the quantitative variable under the horizontal axis. Label the vertical axis frequency.
Place equally spaced tick marks at the smallest value in each interval along the horizontal axis.
Start at 0 on the vertical axis and place equally spaced tick marks until you exceed the largest frequency in any interval.
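The frequency chart step can be sketched as follows. This assumes each interval includes its left endpoint and excludes its right endpoint. The function name, the data, and the interval choices are all made up for illustration.

```python
def frequency_table(values, start, width, num_classes):
    """Count the values falling in each interval [low, low + width)."""
    counts = [0] * num_classes
    for v in values:
        index = int((v - start) // width)   # which interval v belongs to
        if 0 <= index < num_classes:
            counts[index] += 1
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]

# Hypothetical data: five intervals of width 10 starting at 50.
data = [52, 57, 61, 64, 68, 73, 75, 79, 88, 95]
for low, high, count in frequency_table(data, 50, 10, 5):
    print(f"[{low}, {high}): {'#' * count}")
```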
Outlier
From a graph
Does anything look unusual? High or low?
Can use histogram, box plot, or stem/leaf
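Proof
Use 1.5(IQR), where IQR = Q3 - Q1, to find the lower and upper boundaries
Q1 - 1.5(IQR) LOWER
Q3 + 1.5(IQR) UPPER
Any value below the lower boundary or above the upper boundary is an outlier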
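The 1.5(IQR) check can be sketched as follows, assuming Q1 and Q3 have already been found. The function names and the data are hypothetical.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) boundaries from the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """Return the values that fall outside the two boundaries."""
    lower, upper = outlier_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]

# Hypothetical data with Q1 = 3 and Q3 = 9 already computed.
print(outlier_fences(3, 9))
print(find_outliers([1, 3, 4, 6, 7, 9, 30], 3, 9))
```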
How do you make a segmented bar?
Find the relative frequency of each category within each sample (group)
Most often rounded to the nearest tenth
Stack the percentages vertically; each bar should always total 100%
The raw counts in a stack should add up to that group's total
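A minimal sketch of the percentage step for a segmented bar graph. The group names, categories, and counts are invented for illustration.

```python
def segment_percentages(counts):
    """Convert one group's category counts to percentages, rounded to the nearest tenth."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}

# Hypothetical counts for two groups.
groups = {
    "Grade 9": {"bus": 12, "car": 6, "walk": 2},
    "Grade 10": {"bus": 5, "car": 10, "walk": 5},
}
for name, counts in groups.items():
    print(name, segment_percentages(counts))
```

Because of rounding, the percentages in a stack can occasionally add to 99.9 or 100.1 instead of exactly 100.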
How do you make a box plot?
Order the data from least to greatest
Find the minimum, maximum, and median
Q1 is the median of the lower half of the data, not including the median; Q3 is the median of the upper half
Draw and label the axis
Draw five vertical lines above the axis that correspond to the five-number summary
Connect the middle 3 vertical lines to form a box
Draw lines (whiskers) from the box out to the minimum and maximum
What is variance?
The variance is the average of the squared distances of each value from the mean.
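The definition can be worked through directly. This sketch divides by the number of values (the population variance); a sample variance would divide by one less than that. The function name and data are hypothetical.

```python
def population_variance(values):
    """Average of the squared distances from the mean."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

# Hypothetical data: the mean is 5 and the squared distances add up to 32.
print(population_variance([2, 4, 4, 4, 5, 5, 7, 9]))  # 32 / 8 = 4.0
```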
Histogram example
Find the class width by dividing the range by the number of classes.
Range = high – low
Width = range / number of classes
Rounding rule: ALWAYS round up if there is a remainder
Use the width to find the class limits -> start, stop
Find the class boundaries by subtracting 0.5 from each lower class limit and adding 0.5 to each upper class limit (for whole-number data)
Each boundary is midway between the upper class limit of one class and the lower class limit of the next
Class limit: 100 – 104
Class boundary: 99.5 – 104.5
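The steps above can be sketched as follows, assuming whole-number data. The low value of 100, high value of 134, and 7 classes are hypothetical numbers picked so the first class matches the 100 – 104 example. The function name `build_classes` is also made up.

```python
import math

def build_classes(low, high, num_classes):
    """Return (width, limits, boundaries) for whole-number data."""
    # Round up whenever the division leaves a remainder, as the rounding rule says.
    # If it divides evenly, check by hand that the last class still contains `high`.
    width = math.ceil((high - low) / num_classes)
    limits = [(low + i * width, low + (i + 1) * width - 1)
              for i in range(num_classes)]
    boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in limits]
    return width, limits, boundaries

# Hypothetical data range: low = 100, high = 134, 7 classes.
width, limits, boundaries = build_classes(100, 134, 7)
print(width)                      # 34 / 7 is about 4.86, rounded up to 5
print(limits[0], boundaries[0])   # first class limits and boundaries
```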
Center
Which is best?
Mean -
Median -
Data
How do you make a pie chart?
Sum the values
Find the relative frequency for each category
Multiply each relative frequency (as a decimal) by 360 to find the degrees for that slice
A circle has 360 degrees total
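A short sketch of the angle calculation. The category names and counts are invented, and `pie_angles` is a hypothetical helper name.

```python
def pie_angles(counts):
    """Map each category to the central angle (in degrees) of its slice."""
    total = sum(counts.values())                            # sum the values
    return {c: n / total * 360 for c, n in counts.items()}  # relative frequency * 360

# Hypothetical category counts.
angles = pie_angles({"A": 10, "B": 20, "C": 10})
print(angles)
```

The angles should add up to 360, which is a quick way to check the arithmetic.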
How do you make a stem and leaf plot?
Separate each observation into a stem (all but the final digit) and a leaf (the final digit)
Write all possible stems from the smallest to the largest in a vertical column, and draw a vertical line to the right of the column.
Write each leaf in the row to the right of its stem
Arrange the leaves in increasing order out from the stem
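A text sketch of the same steps, assuming non-negative whole numbers so that the stem is everything but the last digit. The function name and the data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return text rows 'stem | leaves' for non-negative whole numbers."""
    leaves = defaultdict(list)
    for v in sorted(values):              # sorting puts the leaves in increasing order
        leaves[v // 10].append(v % 10)    # stem: all but the last digit; leaf: last digit
    stems = range(min(leaves), max(leaves) + 1)   # every possible stem, even empty ones
    return [f"{s} | " + " ".join(str(d) for d in leaves[s]) for s in stems]

# Hypothetical data for illustration.
for row in stem_and_leaf([12, 15, 21, 34, 38, 30]):
    print(row)
```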
What are the definitions and uses of variance and standard deviation?
The standard deviation is the square root of the variance. It measures the typical distance of the data values from the mean.
To determine the spread of the data.
To determine the consistency of a variable.
To determine the number of data values that fall within a specified interval in a distribution
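As a rough illustration of the last use, the sketch below finds the population standard deviation with the standard `statistics` module and counts how many values fall within one standard deviation of the mean. The data set and the choice of a one-standard-deviation interval are assumptions made for the example.

```python
from statistics import mean, pstdev

# Hypothetical data for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(data)        # center of the data
sigma = pstdev(data)   # population standard deviation (square root of the variance)

# Values that fall in the interval from mu - sigma to mu + sigma, endpoints included.
within_one = [x for x in data if mu - sigma <= x <= mu + sigma]
print(mu, sigma, len(within_one))
```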
What are the things to remember?
Histograms are for quantitative data, bar graphs are for qualitative data
A histogram's horizontal axis identifies values of the variable; a bar graph's horizontal axis identifies categories
Remember: histogram bars touch and bar graph bars don’t
Use percents or proportions instead of counts on the vertical axis of histograms when comparing distributions
Use the width to find the class limits -> start, stop
Spread
How much variation is there in the data?
3 types