Two Categorical
Exploring Relationships Between Variables - Scatterplots and Association
Correlation
Linear Modeling and Regression
100

In a large observational study with a randomly selected sample, there was a strong association between reading and having a high grade in English. Can we say reading causes a high grade in English?

No: association (correlation) does not imply causation, and an observational study cannot establish cause and effect.

100

For a scatterplot, the _____ variable is plotted on the x-axis, and the ________ variable is plotted on the y-axis.

What are the explanatory (independent/input) variable and the response (dependent/output) variable?

100

Correlation (the correlation coefficient) is a number that measures the direction and strength of a linear association between two quantitative variables.

r is only used in ___________ relationships

What is linear?

100

What does LSRL stand for?

What is the least-squares regression line?

200

What visual model is used to show an association between two categorical variables?

What is a segmented bar graph (or a side-by-side bar graph or a mosaic plot)?

200

Once we make a scatterplot, we describe the association by telling about: 1.__2.__3.___4.___

Direction: positive or negative slope?

Unusual features: outliers, clusters, subgroups?

Form: straight, curved, no pattern, other?

Strength: how much scatter (how closely the points follow the form): weak, moderate, strong, very strong?

200

Correlation describes the ____ and ____ of the linear relationship between two quantitative variables

What is strength and direction?

200

Actual minus predicted is the formula for...

What is a residual?
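
As a quick illustration of "actual minus predicted", here is a minimal sketch. The function name and the two scores are made up for this example and are not from the game.

```python
def residual(actual, predicted):
    """Residual = actual (observed) value minus the value the model predicted."""
    return actual - predicted

# Hypothetical numbers: a student actually scored 82,
# while the regression line predicted 78.5.
actual_score = 82
predicted_score = 78.5

resid = residual(actual_score, predicted_score)
print(resid)  # 3.5
```

A positive residual means the point lies above the line (the model under-predicted); a negative residual means it lies below.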

300

When shown a segmented bar graph, how can you tell there is NO association?

If the segments in each bar have the same heights (the same proportions) across all the groups.

300

How is the correlation affected by changing units, for example measuring in minutes rather than hours?

There is no effect on the correlation when you change units.
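
To see this with numbers, the sketch below computes r twice: once with time in hours and once in minutes. The data values and the helper name `pearson_r` are assumptions made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: study time and test score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 78]
minutes = [h * 60 for h in hours]  # same times, different units

r_hours = pearson_r(hours, scores)
r_minutes = pearson_r(minutes, scores)
print(r_hours, r_minutes)  # the two values agree (up to rounding)
```

Rescaling a variable stretches the axis but does not change how closely the points follow a line, so r stays the same.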

300

What is the square of the correlation coefficient called?

What is the coefficient of determination?

300

The hat (^) symbol above y stands for this word.

What is "predicted"?

400

If data are presented in this way, you can be sure the data set involves two categorical variables.

What is a two-way table?

400

If an r value is 0.72, what is the strength of the linear relationship?

What is strong?

400

The correlation coefficient's value ranges from ___ to ___.

-1 to +1

400

A regression equation is a "good fit" for the data if the residual plot looks this way

What is random (scattered with no pattern)?

500

To check for an association between two categorical variables, we look for a difference in what?

What are the proportions of each response category within each category of the explanatory variable?
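
A small sketch of that comparison: the two-way table below uses made-up counts and labels, and the function name is hypothetical. It computes the proportion of each response within each explanatory group, which is what a segmented bar graph displays.

```python
# Hypothetical two-way table: rows are the explanatory groups,
# columns are the response categories (counts are invented).
counts = {
    "9th grade": {"reads daily": 30, "does not": 70},
    "12th grade": {"reads daily": 60, "does not": 40},
}

def conditional_proportions(table):
    """Proportion of each response category within each explanatory group."""
    result = {}
    for group, row in table.items():
        total = sum(row.values())
        result[group] = {resp: count / total for resp, count in row.items()}
    return result

props = conditional_proportions(counts)
for group, row in props.items():
    print(group, row)
```

If the proportions were the same in every group, that would suggest no association; here they differ, which suggests an association in this invented table.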

500

How do you "define the variables" in an LSRL?

What is replacing x and y with the names of the variables in context?

Ex: predicted pain = 7 - 0.2(days since surgery)
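
The example equation can be written as a small function so the variable names stay in context. This is only a sketch; the function and parameter names are chosen for illustration.

```python
def predicted_pain(days_since_surgery):
    """LSRL from the example: predicted pain = 7 - 0.2(days since surgery)."""
    return 7 - 0.2 * days_since_surgery

# Predicted pain on the day of surgery and 10 days later.
print(predicted_pain(0))
print(predicted_pain(10))
```

Here 7 is the y-intercept (predicted pain at 0 days) and -0.2 is the slope (predicted pain drops by 0.2 for each additional day).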

500

No correlation (when r = 0) means that, as far as a linear relationship goes, knowing one variable gives you ______.

no information about the other variable

500

These "outliers" are outliers in the x-direction, and they have a strong effect on the slope and the correlation coefficient.

What are influential points?
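
To illustrate how one point far out in the x-direction can pull the line, the sketch below fits the least-squares slope with and without such a point. All data values and the helper name `lsrl_slope` are invented for this example.

```python
def lsrl_slope(xs, ys):
    """Slope of the least-squares regression line for the given points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data that fall exactly on the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

# Add one point far to the right that does not follow the pattern.
xs_with = xs + [20]
ys_with = ys + [5]

slope_without = lsrl_slope(xs, ys)
slope_with = lsrl_slope(xs_with, ys_with)
print(slope_without, slope_with)
```

In this invented data set the single extra point flattens the slope from 2 to nearly 0, which is why such points are worth checking before trusting the line.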
