Graphs
Bivariate Data
Least Squares Regression Line
Categorical data: Frequency tables
Miscellaneous
100

How do you find the interquartile range (IQR)?

Subtract Q1 from Q3
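
A minimal sketch of this calculation, assuming the "median of each half" convention for quartiles (other conventions give slightly different quartiles). The function name `iqr` and the sample scores are made up for illustration.

```python
from statistics import median

def iqr(values):
    """Return Q3 - Q1, taking Q1 and Q3 as the medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2          # size of each half; the middle value is left out when n is odd
    lower = data[:half]            # values below the median position
    upper = data[-half:]           # values above the median position
    return median(upper) - median(lower)

scores = [2, 4, 4, 5, 7, 8, 9, 11, 12]   # hypothetical data
print(iqr(scores))                        # Q1 = 4, Q3 = 10, so IQR = 6.0
```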

100

What two types of variables are included in a bivariate data set?

Explanatory and response variables.

100

What is the y variable?

The response variable. 

100

What must be true about the shape of a symmetric distribution?

The distribution's mean is approximately equal to its median. (It does not have to be normal or unimodal.)

100

What is the median?

The value that divides the ordered data in half: half of the observations fall at or below it and half at or above it.

200

How do you determine if a point is influential?

1. Run Regression

2. Remove point from data set

3. Rerun regression

4. If the regression equation (y-hat) and r-sq change substantially, then the point is influential.
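
The four steps above can be sketched as follows. This is only an illustration: the helper `fit_line` and the data are hypothetical, and what counts as a "significant" change is left as a judgment call, as on the card.

```python
def fit_line(xs, ys):
    """Return (slope, intercept, r_sq) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_sq = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_sq

# Hypothetical data: the last point sits far from the others in x.
xs = [1, 2, 3, 4, 5, 15]
ys = [2, 4, 6, 8, 10, 5]

full = fit_line(xs, ys)                  # step 1: run the regression
reduced = fit_line(xs[:-1], ys[:-1])     # steps 2-3: remove the point and rerun

# Step 4: compare the two fits side by side.
print("with point:    slope=%.3f intercept=%.3f r_sq=%.3f" % full)
print("without point: slope=%.3f intercept=%.3f r_sq=%.3f" % reduced)
```

In this made-up data set the slope and r-sq both change sharply once the point is removed, so the point would be called influential.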

200

What are two commonly used measures to summarize the relation between two variables?

Scatterplots and the correlation coefficient

200

What is the x variable?

The explanatory variable.

200

What is a joint frequency?

The frequency with which two categories, one from each of the two classification criteria, occur together.

200

What are the attributes of a histogram?

Values are grouped into intervals; the height of each bar shows the frequency of its interval.

300

When the distribution is skewed right, is the mean greater than or less than the median?

Mean is greater than the median

300

What does the correlation coefficient measure?

The strength and direction of a relationship.

300

What is a residual? How is it calculated?

The residual is the vertical distance between the actual value and the least-squares regression line. It is calculated by subtracting the predicted value from the actual value (residual = actual - predicted), so it is positive when the point lies above the line and negative when it lies below.
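
A small illustration of the formula, assuming a hypothetical fitted line y-hat = 1.5 + 2.0x. The function name and the numbers are invented for the example.

```python
def residual(actual_y, x, slope, intercept):
    """Return actual - predicted for one data point."""
    predicted_y = intercept + slope * x
    return actual_y - predicted_y

# Hypothetical line: y-hat = 1.5 + 2.0x
print(residual(8.0, 3.0, slope=2.0, intercept=1.5))   # predicted 7.5, residual 0.5
print(residual(5.0, 3.0, slope=2.0, intercept=1.5))   # predicted 7.5, residual -2.5
```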

300

What is marginal frequency?

The total frequency of one category: a row total or column total in a two-way table.

300

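Find the x-value of a data point at the 95th percentile in a normal distribution with a mean of 85 and standard deviation of 4.

x = 91.56 using the table value z = 1.64 (x = 85 + 1.64(4)); the more precise z ≈ 1.645 gives x ≈ 91.58.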
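
One way to check this kind of answer, assuming a normal model, is Python's `statistics.NormalDist`. It uses the unrounded z-value (about 1.645), so it returns roughly 91.58 rather than the 91.56 obtained from a table value of z = 1.64.

```python
from statistics import NormalDist

# Normal model with the mean and standard deviation given in the question.
dist = NormalDist(mu=85, sigma=4)

z_95 = NormalDist().inv_cdf(0.95)   # z-score at the 95th percentile, about 1.645
x_95 = dist.inv_cdf(0.95)           # same as 85 + z_95 * 4

print(round(z_95, 3))   # 1.645
print(round(x_95, 2))   # 91.58
```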

400

What is a back to back stemplot?

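A stemplot with one shared stem, where the leaves of one data set extend to the left and the leaves of the other extend to the right. It can only be used to compare two data sets.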

400

What two methods do we use to determine if a linear model is appropriate?

Residual plots and coefficient of determination. 

400

What 4 pieces of information can you extract from a regression output table?

explanatory variable

slope

y-intercept

r-sq

400

What is conditional relative frequency?

The relative frequency of one category given that the other category has occurred: the joint frequency divided by the marginal frequency of the given category.
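
To tie the joint, marginal, and conditional frequency cards together, here is a sketch on a made-up two-way table (class year by "owns a car"). The category names and counts are hypothetical.

```python
# Hypothetical two-way table: (class year, owns a car) -> count
counts = {
    ("freshman", "yes"): 30,
    ("freshman", "no"): 20,
    ("senior", "yes"): 10,
    ("senior", "no"): 40,
}

# Joint frequency: both categories occur together.
joint = counts[("freshman", "yes")]

# Marginal frequency: the total for one category (here, the freshman row).
marginal_freshman = sum(n for (year, _), n in counts.items() if year == "freshman")

# Conditional relative frequency: share of freshmen who own a car.
conditional = joint / marginal_freshman

print(joint, marginal_freshman, conditional)   # 30 50 0.6
```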

400

What is used to denote a population standard deviation? 

Greek symbol sigma

500

How do you determine the shape of a boxplot?

Examine the lengths of the tails (whiskers) and the position of the median within the box.

500

What aspects of a linear regression are affected by either shifting or scaling? (hint: consider slope, y-intercept, r, and r-sq)

Y-intercept: shifting and scaling

slope: scaling

r: neither

r-sq: neither
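
The pattern above can be checked numerically. This is a sketch on invented data with a hypothetical helper `fit`; it shifts x by adding 10 and scales y by multiplying by 3, then compares each fit with the original.

```python
import math

def fit(xs, ys):
    """Return (slope, intercept, r) for the least-squares line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 3, 5, 4, 6]

original = fit(xs, ys)
shifted_x = fit([x + 10 for x in xs], ys)   # shift: add 10 to every x
scaled_y = fit(xs, [3 * y for y in ys])     # scale: multiply every y by 3

for label, (slope, intercept, r) in [
    ("original", original), ("x shifted", shifted_x), ("y scaled", scaled_y)
]:
    print(f"{label:10s} slope={slope:.2f} intercept={intercept:.2f} r={r:.2f}")
```

Shifting changes only the intercept, scaling changes both the slope and the intercept, and r (and so r-sq) stays the same in both cases.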

500

How do you interpret r-sq? 

R-sq is the percent of variation in the response variable that is accounted for by the linear model relating the response variable to the explanatory variable.
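
To make "percent of variation accounted for" concrete, here is a sketch that computes r-sq as 1 minus the ratio of leftover (residual) variation to total variation in y. The function name and data are made up for illustration.

```python
def r_squared(xs, ys):
    """Return 1 - SS_residual / SS_total for the least-squares line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    ss_total = sum((y - mean_y) ** 2 for y in ys)                                # total variation in y
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))   # variation left over
    return 1 - ss_resid / ss_total

xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 3, 5, 4, 6]
print(round(r_squared(xs, ys), 2))   # 0.81
```

Here about 81% of the variation in y is accounted for by the linear model with x.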

500

What aspects of univariate data are affected by shifting and scaling? (SOCS)

Center (mean and median): shifting and scaling

Spread (range, IQR, SD): scaling

Shape: neither
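
A quick numerical check of the center and spread claims, using a small made-up data set. Shifting adds 10 to every value and scaling multiplies every value by 3.

```python
from statistics import mean, median, stdev

data = [2, 4, 4, 6, 9]                 # hypothetical data
shifted = [x + 10 for x in data]       # shift: add a constant
scaled = [x * 3 for x in data]         # scale: multiply by a constant

def summary(values):
    """Return (mean, median, range, standard deviation)."""
    return mean(values), median(values), max(values) - min(values), stdev(values)

for label, values in [("original", data), ("shifted", shifted), ("scaled", scaled)]:
    m, med, rng, sd = summary(values)
    print(f"{label:9s} mean={m:.1f} median={med:.1f} range={rng} sd={sd:.3f}")
```

The shifted set moves the mean and median by 10 but leaves the range and SD alone; the scaled set multiplies all four by 3.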

500

What is a z-score?

The number of standard deviations a value lies above or below the mean: z = (value - mean) / standard deviation.
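
As a small illustration, the definition translates directly into code. The function name and the example values are hypothetical.

```python
def z_score(value, mean, sd):
    """Return how many standard deviations `value` lies from `mean` (negative if below)."""
    return (value - mean) / sd

print(z_score(91, mean=85, sd=4))   # 1.5 standard deviations above the mean
print(z_score(79, mean=85, sd=4))   # -1.5, i.e. below the mean
```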