Histograms & Box Plots
Shape, Center, and Spread
Two-Way Frequency Tables & Relative Frequency
Scatter Plots, Correlation Coefficients, and Residuals
Linear Regression, Slopes, and Y-Intercepts

The section between Q1 and Q3, where the middle 50% of the data lies.

The Interquartile Range (IQR)


This is when you add up all the data points and then divide by the total number of data points. 

The mean or the average 


Two-way frequency tables are best used to represent this type of data.

Categorical data 


A straight line that minimizes the overall distance between itself and the data points.

Best fit line


When calculating the linear regression line in Google Sheets, what function do we use?



If Q1 = 82 and Q3 = 105, what is the value of the IQR? 
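
The IQR is the width of the box, so it can be found by subtracting Q1 from Q3. A minimal sketch of that arithmetic (the function name `iqr` is just an illustrative choice):

```python
def iqr(q1, q3):
    """Interquartile range: the distance from Q1 up to Q3."""
    return q3 - q1

print(iqr(82, 105))  # 23
```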



Find the MEDIAN of the following data set: 

5, 19, 62, 17, 12, 4, 4, 8, 10, 1, 54, 49. 
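
One way to check a median by hand is to sort the data first, then take the middle value (or the average of the two middle values when the count is even). A small sketch of that procedure, with a hypothetical helper named `median`:

```python
def median(values):
    """Middle value of the sorted data; average the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

data = [5, 19, 62, 17, 12, 4, 4, 8, 10, 1, 54, 49]
print(sorted(data))   # [1, 4, 4, 5, 8, 10, 12, 17, 19, 49, 54, 62]
print(median(data))   # 11.0
```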



The TOTAL COUNTS of the data can be located in this section of a frequency table. 

Marginal Frequencies 


You notice that the y-values in a data set increase as the x-values increase. What does this tell you about the correlation coefficient?

It's positive 


f(2)=3 and f(-4)=6. What is the slope? 

-1/2 (or -0.5)
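
The slope comes from the change in y divided by the change in x between the two points (2, 3) and (-4, 6). A short sketch of that calculation, using Python's `Fraction` so the result stays an exact fraction (the function name `slope` is an illustrative choice):

```python
from fractions import Fraction

def slope(x1, y1, x2, y2):
    """Rise over run between two points, kept as an exact fraction."""
    return Fraction(y2 - y1, x2 - x1)

m = slope(2, 3, -4, 6)
print(m)         # -1/2
print(float(m))  # -0.5
```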


A graph used to represent the frequency distribution of a data set, often by classifying data into various “bins” or “range groups” and counting how many data points belong to each of those bins.
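
To see what "binning" looks like, here is a rough sketch that counts how many values land in each bin. The data values are borrowed from the modality question in this review; the bin width of 10 and the helper name `bin_counts` are assumptions made for illustration.

```python
from collections import Counter

def bin_counts(values, width):
    """Count how many values fall in each bin [start, start + width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))

data = [1, 2, 5, 7, 5, 6, 8, 8, 9, 20, 21, 22, 22, 24, 25]
width = 10  # assumed bin width
counts = bin_counts(data, width)

# Bins with no data points (here 10 to 19) simply do not appear in the result.
for start, count in counts.items():
    print(f"{start} to {start + width - 1}: {'#' * count}")
```

Each row of `#` marks plays the role of one bar in the graph.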



Is the following data set unimodal, bimodal, or multimodal? 

1, 2, 5, 7, 5, 6, 8, 8, 9, 20, 21, 22, 22, 24, 25 



What is it called when you take the entries in each row of the table and divide by the total for that row? 

Relative frequency of rows
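
As a quick illustration, the sketch below divides each entry by its row total. The counts come from the sports survey question later in this review; the function name `row_relative_frequencies` and the dictionary layout are assumptions, not a required format.

```python
def row_relative_frequencies(table):
    """Divide every entry in a row by that row's total."""
    result = {}
    for row_label, row in table.items():
        row_total = sum(row.values())
        result[row_label] = {col: count / row_total for col, count in row.items()}
    return result

survey = {
    "male": {"basketball": 10, "football": 25},
    "female": {"basketball": 22, "football": 10},
}
rel = row_relative_frequencies(survey)
print(rel["female"]["basketball"])  # 0.6875
```

Each row of the result adds up to 1, which is a handy check.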


This is the vertical distance between an actual data point and the best fit line.



f(-3)=5 and m=15. What is the y-intercept (b-value)?
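
Starting from y = mx + b, the b-value can be found by substituting the known point and slope, then solving: b = y - mx. A minimal sketch of that substitution (the name `y_intercept` is an illustrative choice):

```python
def y_intercept(x, y, m):
    """Solve y = m*x + b for b, given one point (x, y) and the slope m."""
    return y - m * x

b = y_intercept(-3, 5, 15)
print(b)  # 50

# Check: plugging x = -3 back into y = 15x + 50 should give 5.
print(15 * -3 + b)  # 5
```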



What are the full steps for creating a box plot? 

1. Find the lowest and highest data points. 

2. Find the median.

3. Find the median of the first half (Q1) and the median of the second half (Q3).

4. Draw the IQR box and whiskers.
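
The first three steps above can be sketched in code as a five-number summary. This is only one possible version: it assumes the common classroom convention of leaving the median out of both halves when the number of data points is odd, and it reuses the data from the median question in this review. The function name `five_number_summary` is hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Min, Q1, median, Q3, max using the median-of-each-half method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: with an odd count, the middle value belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return {
        "min": ordered[0],
        "q1": median(lower),
        "median": median(ordered),
        "q3": median(upper),
        "max": ordered[-1],
    }

data = [5, 19, 62, 17, 12, 4, 4, 8, 10, 1, 54, 49]
summary = five_number_summary(data)
for name, value in summary.items():
    print(name, value)
```

The box is drawn from `q1` to `q3` with a line at `median`, and the whiskers reach out to `min` and `max`.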


If the data is symmetric, what do we know about the mean and the median? 

They are equal. 


What is the difference between a two-way frequency table and a relative frequency table?

A two-way frequency table has whole numbers. A relative frequency table has decimals or percentages. 


You calculate the correlation coefficient of a data set to be -0.75. What do you expect the scatter plot to look like?

A fairly strong negative linear correlation: the points fall somewhat close to a line that slopes downward.


The x-values of a data set represent age and the y-values represent height. Describe the slope. 

Height per unit of age (how much height changes for each one-unit increase in age)


What percent of the data is represented by the IQR box and what percent of the data is each of the whiskers? 

IQR Box = 50% of data

Each whisker = 25% of data 


If a data distribution is SKEWED LEFT, what is the relationship between the mean and the median? 

The median is larger than the mean. 


You collect some data regarding male/female and sport preference. In your study, 35 males and 32 females participated. 10 males prefer basketball, 25 males prefer football. 22 females prefer basketball, 10 females prefer football. 

What is the relative frequency of females who prefer basketball to the total number of students? 
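
This kind of relative frequency divides one cell of the table by the grand total of everyone surveyed. A small sketch of the setup, with the counts taken from the problem (the variable names are illustrative):

```python
counts = {
    ("male", "basketball"): 10,
    ("male", "football"): 25,
    ("female", "basketball"): 22,
    ("female", "football"): 10,
}

total = sum(counts.values())  # 35 males + 32 females
joint = counts[("female", "basketball")] / total

print(total)           # 67
print(f"{joint:.3f}")  # 0.328
```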



You notice on a residual plot that all the points are clustered close to the x-axis. What does this tell you about the correlation coefficient? 

That it is close to 1 or -1 (its absolute value is close to 1) 
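
Small residuals mean the points hug the best fit line, so the correlation coefficient is strong in size, while its sign follows the direction of the line. The sketch below computes r by hand for a made-up data set that falls tightly along a decreasing line; the data values and the helper name `pearson_r` are assumptions for illustration only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical points lying very close to a falling line.
xs = [1, 2, 3, 4, 5]
ys = [10.1, 7.9, 6.0, 4.1, 1.9]

r = pearson_r(xs, ys)
print(round(r, 3))  # very close to -1, not +1
```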


A local Dunkin' Donuts is collecting data on iced coffee sales vs. temperature. They find the linear regression model to be y=3.2x+13. Interpret the meaning of the numbers 3.2 and 13 in the context of this problem. 

3.2 - the slope: for each 1-degree increase in temperature, predicted iced coffee sales increase by 3.2

13 - the y-intercept: the predicted number of iced coffees sold when the temperature is 0
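
One way to test an interpretation is to plug numbers into the model y=3.2x+13 and watch what each number does. In this sketch the temperatures 70 and 71 are made-up inputs, and the name `predicted_sales` is an illustrative choice.

```python
SLOPE = 3.2      # change in predicted sales for each 1-unit rise in temperature
INTERCEPT = 13   # predicted sales when the temperature (x) is 0

def predicted_sales(temperature):
    """Evaluate the regression model y = 3.2x + 13."""
    return SLOPE * temperature + INTERCEPT

print(predicted_sales(0))  # 13.0

# Raising the temperature by 1 changes the prediction by the slope.
change = predicted_sales(71) - predicted_sales(70)
print(round(change, 1))  # 3.2
```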