What is categorical data?
What is quantitative data?
- Categorical data is names, labels, and categories.
- Quantitative data is numerical values, measurements, or counts.
What graphs do you use to display quantitative variables?
And what are they?
Dot plot- Shows each data value as a dot above its location on the number line.
Histogram- Shows each interval as a bar; bar heights can be frequencies (counts) or relative frequencies. Gaps help to identify outliers.
Stem and leaf plot- Splits each value into a stem (leading digits) and a leaf (last digit).
1. How do you find the IQR?
2. What is the outlier rule?
3. What is the line in the middle of the box in a box plot?
1. IQR = Q3 - Q1
2. Way too small (low outlier): any value less than Q1 - 1.5(IQR)
Way too big (high outlier): any value greater than Q3 + 1.5(IQR)
3. The median.
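A minimal Python sketch of the IQR and outlier rule. The `scores` list is made up, and `quartiles` is a hypothetical helper that takes the median of each half; calculators and software may use slightly different quartile methods.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median of each half of the sorted data."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the middle position
    upper = data[half + (n % 2):]  # values above the middle position
    return median(lower), median(data), median(upper)

def outlier_fences(values):
    """Return (low fence, high fence) from the 1.5(IQR) rule."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

scores = [2, 4, 5, 6, 7, 8, 9, 10, 30]  # made-up data
q1, med, q3 = quartiles(scores)
low, high = outlier_fences(scores)
outliers = [x for x in scores if x < low or x > high]
```

Here Q1 = 4.5 and Q3 = 9.5, so IQR = 5, the fences are -3 and 17, and only 30 is flagged as an outlier.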
What happens to the center, shape, and variability when you add or subtract a constant to/from every score?
Shape: stays the same
Center: mean/median goes up or down by the constant
Variability: stays the same
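A quick check in Python, using made-up scores and an assumed constant of 5 added to each one:

```python
import math
from statistics import mean, median, stdev

scores = [70, 75, 80, 85, 100]      # made-up data
shifted = [x + 5 for x in scores]   # add the same constant to every score

center_shift_mean = mean(shifted) - mean(scores)        # change in the mean
center_shift_median = median(shifted) - median(scores)  # change in the median
sd_unchanged = math.isclose(stdev(shifted), stdev(scores))
range_unchanged = (max(shifted) - min(shifted)) == (max(scores) - min(scores))
```

Both measures of center move by exactly 5, while the SD and the range stay put.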
What words do you use when comparing distributions?
Greater than, less than, similar to, higher, lower, larger, smaller.
What are the two types of quantitative data and what does each refer to?
Two types: Discrete & Continuous
Discrete: COUNTABLE data whose values are separate and distinct. Ex. number of students in a classroom; there can't be one and a half students.
Continuous: Can take on any value within a given range, including fractions or decimals. Ex. height, 7.5 inches
What do these refer to when describing a distribution?
Shape
Center
Outliers
Variability
Shape- Uniform, Symmetric, skewed right/left.
Use -ly words, ex. approximately symmetric, approximately uniform
Center- Mean/Median
Outliers- Gaps, potential outlier rule
Variability- Standard deviation, IQR
1. What is the rule about each quartile?
2. What is SD?
1. Each quartile marks off another 25% of the data.
ex. Q1 = 25%, Median/Q2 = 50%, Q3 = 75%, Max = 100%
2. SD (standard deviation): the typical distance of the values from the mean
What happens to the center, shape, and variability when you multiply or divide every score by a constant?
Shape: stays the same (for a positive constant)
Center (mean or median): multiplied or divided by the constant
Variability (SD or IQR): multiplied or divided by the constant
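A quick check in Python. The heights are made up, and the constant 2.54 (inches to centimeters) is just an illustrative choice:

```python
import math
from statistics import mean, median, stdev

inches = [60, 62, 65, 70, 73]        # made-up heights
cm = [x * 2.54 for x in inches]      # multiply every value by the same constant

mean_scaled = math.isclose(mean(cm), 2.54 * mean(inches))
median_scaled = math.isclose(median(cm), 2.54 * median(inches))
sd_scaled = math.isclose(stdev(cm), 2.54 * stdev(inches))
```

All three comparisons come out true: center and variability are both multiplied by the constant.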
What do you do if you're not given the z-score, the mean, or a value?
Find the z-score whose table area is closest to the given area (the percentage written as a decimal), then plug it into the z-score equation and solve for the missing value or mean.
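A sketch of that process in Python, assuming a Normal distribution. `statistics.NormalDist` stands in for the z-table, the helper names are hypothetical, and the numbers (90% area, mean 50, SD 10) are made up:

```python
from statistics import NormalDist

def z_for_area(area_to_left):
    """z-score with the given area to its left under the standard Normal curve."""
    return NormalDist().inv_cdf(area_to_left)

def solve_value(z, mean, sd):
    """Rearranged z-score equation: value = mean + z * SD."""
    return mean + z * sd

def solve_mean(z, value, sd):
    """Rearranged z-score equation: mean = value - z * SD."""
    return value - z * sd

z = round(z_for_area(0.90), 2)               # two decimal places, like the table
value = solve_value(z, mean=50, sd=10)       # value at the 90th percentile
mean_back = solve_mean(z, value=value, sd=10)
```

With 90% of the area to the left, z is about 1.28, so the value is about 62.8; solving back for the mean returns 50.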
1. How do you find the proportion of something?
2. The percentage?
3. What are variables?
1. #/total amount = proportion
2. #/total amount x 100 = percentage
3. Variables are characteristics or attributes that can be measured (quantitative) or categorized (categorical) and can take on different values.
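The two formulas as a tiny Python sketch. The function names and the counts (12 out of 48) are made up for illustration:

```python
def proportion(count, total):
    """Number in the group divided by the total amount."""
    return count / total

def percentage(count, total):
    """Proportion times 100."""
    return proportion(count, total) * 100

p = proportion(12, 48)    # 12 out of 48
pct = percentage(12, 48)  # same count as a percentage
```

Here the proportion is 0.25 and the percentage is 25%.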
What does it mean for center when a distribution is left skewed?
Right skewed? Symmetric?
If the tail is pointing left then it's _____
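Left skewed- the mean is usually less than the median (the tail pulls the mean left)
Right skewed- the mean is usually greater than the median (the tail pulls the mean right)
Symmetric- mean is about equal to the median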
If the tail is pointing left then it's left-skewed
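A small Python illustration with three made-up data sets. This is a rule of thumb shown on hand-picked examples, not a proof:

```python
from statistics import mean, median

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]         # long tail to the right
left_skewed = [1, 17, 18, 18, 19, 19, 19, 20]    # long tail to the left
symmetric = [1, 2, 3, 4, 5]

right_gap = mean(right_skewed) - median(right_skewed)  # positive: mean > median
left_gap = mean(left_skewed) - median(left_skewed)     # negative: mean < median
sym_gap = mean(symmetric) - median(symmetric)          # zero: mean = median
```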
1. What is a percentile?
2. What is cumulative relative frequency?
4. How do you make a cumulative relative frequency graph?
1. The % of data that is less than or equal to a certain value.
2. The running total of the percentages (relative freqs.) up through each value or interval; it should equal 100% at the end.
4. The y-axis is the cumulative relative frequency and the x-axis is the data values; the highest value in each interval is plotted against its cumulative percent from the table.
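A sketch of building the cumulative relative frequency column in Python. The intervals and counts are made up:

```python
from itertools import accumulate

# Made-up frequency table: interval label -> count
counts = {"0-9": 4, "10-19": 10, "20-29": 6}

total = sum(counts.values())
running_counts = list(accumulate(counts.values()))   # running total of counts
cum_rel_freq = [c / total for c in running_counts]   # running total as a proportion

# Pair each interval with its cumulative relative frequency
table = dict(zip(counts, cum_rel_freq))
```

The last entry comes out to 1.0 (100%), which is a quick check that nothing was left out.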
Inverse normal CDF (invNorm) on calculator- parameters (Greek symbols)
What do these refer to?
Area
μ
σ
Tail
Area- percent as a decimal
μ- mean
σ- SD
Tail- which part of the diagram is shaded
If the shaded area is on the right, use the rest of the graph's area (1 - shaded area) as the Area to find the value.
If it's on the left, use the given area on the left as the Area as is.
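The same idea sketched in Python, with `statistics.NormalDist.inv_cdf` standing in for the calculator's inverse normal function. `inv_norm` is a hypothetical helper, and the numbers (area 0.10, mean 100, SD 15) are made up:

```python
from statistics import NormalDist

def inv_norm(area, mu, sigma, tail="left"):
    """Value that cuts off `area` in the given tail of a Normal(mu, sigma) curve."""
    if tail == "right":
        area = 1 - area  # shaded on the right: use the rest of the graph's area
    return NormalDist(mu, sigma).inv_cdf(area)

left_cut = inv_norm(0.10, mu=100, sigma=15, tail="left")    # bottom 10%
right_cut = inv_norm(0.10, mu=100, sigma=15, tail="right")  # top 10%
```

The two cutoffs land about 19.2 below and 19.2 above the mean, mirror images of each other.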
How would you describe an association?
What is frequency?
And the sum of frequencies?
And relative freq.?
Knowing x helps us to predict y. Ex. (Group A) is more likely to prefer _____ than (Group B).
Frequency is how often a value occurs in a data set, and the sum of frequencies is the total number of observations in the data set.
Relative frequency is the proportion or percentage of the data set that a value accounts for: its frequency divided by the total number of observations.
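A short Python sketch of frequency, sum of frequencies, and relative frequency. The survey responses are made up:

```python
from collections import Counter

responses = ["cat", "dog", "dog", "fish", "dog",
             "cat", "dog", "fish", "dog", "cat"]   # made-up data

freq = Counter(responses)     # frequency: how often each value occurs
total = sum(freq.values())    # sum of frequencies: number of observations
rel_freq = {value: count / total for value, count in freq.items()}
```

Here "dog" has frequency 5 out of 10 observations, so its relative frequency is 0.5 (50%).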
1. What measures are least affected by skewness and outliers?
2. What measures are most affected by skewness and outliers?
1. IQR and median are least affected by skewness and outliers, so if a graph is skewed, you would use those measures.
2. Mean and SD are most affected by skewness and outliers, so you would use those for roughly symmetric (or uniform) data.
How do you interpret a value at the 28th percentile?
ex. find the percent using the rank of the value in the data (how high it is)
ex. 14/50 x 100 = 28, so 28th percentile
Utah is at the 28th percentile, which means 72% of states have a higher number of public colleges than Utah does, and 28% of states have 17 public colleges or fewer.
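A sketch of the percentile calculation in Python. `percentile_of` is a hypothetical helper and the `colleges` list is made up; only the 14 out of 50 figures come from the example above:

```python
def percentile_of(value, data):
    """Percent of data values less than or equal to `value`."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

colleges = [3, 8, 12, 17, 17, 21, 30, 42, 55, 60]  # made-up counts
pct = percentile_of(17, colleges)                  # 5 of the 10 values are <= 17

rank_based = 100 * 14 / 50  # rank 14 out of 50, as in the example
```

The rank-based line reproduces the 28th percentile; the helper does the same count-and-divide on a full list.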
1. Talking in terms of time: "faster" means the lower number rather than the higher number. "Greater" means a larger number.
Fastest time- left side of graph
2. If there are two values and you're trying to find the area between them, find the z-score for each, look up the table area for each, then subtract the smaller percentage or proportion from the larger one.
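The "between two values" step sketched in Python, assuming a Normal distribution. `NormalDist.cdf` plays the role of the table lookup, `area_between` is a hypothetical helper, and the numbers are made up:

```python
from statistics import NormalDist

def area_between(low, high, mean, sd):
    """Proportion of a Normal(mean, sd) distribution between `low` and `high`."""
    dist = NormalDist(mean, sd)
    # Area to the left of the higher value minus area to the left of the lower value
    return dist.cdf(high) - dist.cdf(low)

# Values one SD below and one SD above the mean (z = -1 and z = 1)
middle = area_between(85, 115, mean=100, sd=15)
```

This gives roughly 0.68, the familiar share of values within one SD of the mean.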
What is a segmented bar graph?
A mosaic plot?
Association?
A segmented bar graph has one bar per group, with stacked segments that add up to 100%; the bars are not connected.
In a mosaic plot, the boxes are connected and the width of each bar is determined by the number of people in that group. ex. more people, wider box.
Association: if knowing the value of one variable helps us to predict the other variable.
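A sketch of the numbers behind both graphs in Python. The two-way table is made up, and the group and response names are placeholders:

```python
# Made-up two-way table: group -> response -> count
table = {
    "Group A": {"yes": 30, "no": 20},
    "Group B": {"yes": 10, "no": 40},
}

# Segment heights: each group's responses as proportions of that group (sum to 1)
segments = {
    group: {resp: n / sum(row.values()) for resp, n in row.items()}
    for group, row in table.items()
}

# Mosaic plot bar widths: each group's share of everyone surveyed
grand_total = sum(sum(row.values()) for row in table.values())
widths = {group: sum(row.values()) / grand_total for group, row in table.items()}
```

Group A says "yes" 60% of the time versus 20% for Group B, so knowing the group helps predict the response: an association.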
Describe what these symbols look like on a calculator (non-Greek symbols: L1, Calc, 1-Var Stats)
Mean
SD (how would you describe SD as well)
Number of values
Minimum
Q1
Q3
Median
Max
Mean= x with a line above it
SD= Sx (the ____ typically varies by SD from the mean of ________)
Number of values= n
Minimum= minX
Q1= Q1
Q3= Q3
Median= Med
Max= maxX
What does a z-score tell us?
How do you find it?
A z-score tells us how many SDs a value is above or below the mean; it does NOT tell us the value.
z-score = (value - mean) / SD
positive score= above the mean
negative score= below the mean
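The z-score formula as a small Python function. The name `z_score` and the example numbers (mean 100, SD 15) are made up:

```python
def z_score(value, mean, sd):
    """How many SDs `value` is above (+) or below (-) the mean."""
    return (value - mean) / sd

below = z_score(85, mean=100, sd=15)    # one SD below the mean
above = z_score(130, mean=100, sd=15)   # two SDs above the mean
```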
Going left- leave the percentage found as is.
How do you find the percentage?
ex.
USE TWO DECIMAL PLACES.
z = -1.03
Row -1.0 on the negative side, column 0.03
NUMBER is 0.1515
Percent: 0.1515 x 100 = 15.15%, about 15%
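A way to check a table lookup in Python, assuming a standard Normal curve. `NormalDist().cdf` gives the area to the left of a z-score, which is what the table lists:

```python
from statistics import NormalDist

z = -1.03
area_left = NormalDist().cdf(z)      # area to the left of z
table_value = round(area_left, 4)    # tables usually show four decimal places
percent = table_value * 100
```

For z = -1.03 the table value is 0.1515, or about 15%.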