Quantitative Data
Categorical Data
Normal Curve
Linear Regression
Analyzing Linear Regression
100

This measure of center is more resistant to outliers than the mean.

Median

100

What measure do we use to compare sets of categorical data?

Relative Frequencies

100

What is the definition of a percentile?

The percent of data BELOW a certain value

100

Two part-er What is the equation for a residual? How do I interpret a residual plot?

Residual=observed-predicted.


Look for patterns in a residual plot. Pattern=bad for linear model.

100

What is the generic formula for interpreting the slope? The y-int?

"for each additional x-unit, the y-unit increases on average by the slope."


At x=0, the y-unit is "y-int".

200

What measure do I use to compare two sets of quantitative data?

Z score
200

Name one key word, phrase, or method to distinguish between joint frequency and conditional relative frequency? What is the denominator in joint relative frequency vs conditional relative frequency?

"Of those who" OR Draw a line at the action verb.


Joint relative: divide by grand total

Conditional relative: divide by category total

200

This calculator command can be used to find the area under a normal distribution and above an interval.

normalcdf

200

The fraction of the variables in the values of y that is explained by the LSR of y on x. This is the interpretation for what value?

r2

200

R2=0.5 and the slope is -4.2. What is the direction and strength of this correlation?

r=-0.71. It is negative and strong.

300

Name 2 graphs we can use to represent quantitative data

Box plot, dot plot, stem plot, histogram

300

True or false. Phone number is an example of categorical data. Explain.

True. You cannot take an average phone number. They categorize people by numbers.
300

This rule helps to determine if data is normally distributed by checking the number of observations within each interval. What are the intervals?

Empirical Rule. 68-95-99.7 rule. The intervals are -1/1 standard deviations, -2/2 standard deviations, and -3/3 standard deviations.

300

Name three ways I can determine if a linear model is good fit for a dataset.

Look at r, r2, and residual plot

300

is there a good fit for a linear model:

No. Cone shape.

400

When asked to describe the distribution of quantitative data, what information must I provide?

S hape
O utliers
C enter
S pread

400

Name 2 graphs we can use to represent categorical data.

Bar chart, pie chart, segmented bar chart

400

The trees in Newark have an average height of 17 ft with a standard deviation of 10 ft. How tall must a tree be to be in the top 10% of tree heights? In the bottom 10%?

29.8 ft

400

What is the regression line? What is r

y-hat = -3.01x +576.68

r=0.89

400

What is the goal of linear regression?

To see if we can establish a linear relationship between two variables. We can use this to make predictions about unknown data, and to find an average relationship.

500

For what distributions do I use mean and standard deviation to describe the center and spread?

Symmetrical

500

True or false. The number of games won by basketball team is higher than baseball team. Explain.  

False. The percentage is higher, but we cannot make conclusions about number of games since no "n" is given.

500

The average free throw percentage in the NBA is 0.7 with a standard deviation of .08. What percent of players shoot above .82?

6.6%

500

What is the response variable in linear regression. Why is it called the response variable?

The y variable. Because it "responds" to the x input. It's also known as the dependent variable since it depends on x.


500

What conclusion can you draw about the relationship between x and y:

There was a transformation of data done. It now has a strong linear relationship.