In a large observational study with a randomly selected sample, there was a strong association between reading and having a high grade in English. Can we say reading causes a high grade in English?
NO: correlation does not imply causation
For a Scatterplot, the _____ variable is plotted on the x-axis, and the ________ variable is plotted on the y-axis
What are the explanatory (independent/input) variable and the response (dependent/output) variable?
Correlation (the correlation coefficient) is a number that measures the direction and strength of a linear association between two quantitative variables.
r is only used in ___________ relationships
What is linear?
What does LSRL stand for?
What is the least-squares regression line?
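A minimal sketch of how a least-squares regression line can be computed by hand, using the standard formulas slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). The function name `least_squares_line` and the data values are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the line that minimizes the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sxy: how x and y vary together; Sxx: how x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The LSRL always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up data: days since surgery (x) and reported pain (y).
days = [1, 3, 5, 7, 9]
pain = [6.9, 6.3, 6.1, 5.5, 5.2]

intercept, slope = least_squares_line(days, pain)
print(f"predicted pain = {intercept:.2f} + ({slope:.3f})(days since surgery)")
```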
What visual model is used to show an association between two categorical variables?
What is a segmented bar graph (or a side-by-side bar graph or a mosaic plot)?
Once we make a scatterplot, we describe the association by telling about: 1.__2.__3.___4.___
Direction: + or - slope?
Unusual Features: outliers, clusters, subgroups?
Form: straight, curved, no pattern, other?
Strength: how much scatter (how closely the points follow the form)? Weak, moderate, strong, or very strong?
Correlation describes the ____ and ____ of the linear relationship between two quantitative variables
What are strength and direction?
Actual minus predicted is the equation for...
What is a residual?
When shown a segmented bar graph, how can you tell there is NO association?
If the segments have the same proportions (heights) in every bar
How is the correlation affected by changing units, e.g., measuring in minutes rather than hours?
There is no effect on the correlation when you change units.
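A short sketch that checks this claim numerically. The helper `pearson_r` and the data are made up for illustration; the same x-values are expressed in hours and then in minutes.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: time spent reading and English scores.
hours = [1.0, 2.0, 3.5, 4.0, 5.5]
scores = [62, 70, 74, 81, 90]
minutes = [h * 60 for h in hours]  # same times, different units

r_hours = pearson_r(hours, scores)
r_minutes = pearson_r(minutes, scores)
print(round(r_hours, 4), round(r_minutes, 4))  # the two values match
```

Rescaling x multiplies Sxy by 60 and Sxx by 3600, so the factor cancels in the ratio.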
What is the square of the correlation coefficient called?
What is the coefficient of determination?
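The coefficient of determination is usually read as the fraction of the variation in y that the least-squares line accounts for. The sketch below computes it that way, as 1 - SSE/SST, and compares it with the square of r. The function name `r_squared` and the data are hypothetical.

```python
import math

def r_squared(xs, ys):
    """Fraction of variation in y explained by the least-squares line: 1 - SSE/SST."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # SSE: squared residuals from the line; SST: squared deviations from mean_y.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst

# Made-up data.
xs = [1, 2, 3, 4, 5, 6]
ys = [3.0, 4.1, 4.8, 6.5, 6.9, 8.2]

r2 = r_squared(xs, ys)

# Square of the correlation coefficient, computed separately for comparison.
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
r = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / math.sqrt(
    sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)
)
print(round(r2, 4), round(r ** 2, 4))  # the two values agree
```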
The hat (^ ) symbol above y stands for this word
What is "predicted"?
If data is presented in this way, you can be assured it is a data set with two categorical variables
What is a two-way table?
If an r value is 0.72, what is the strength of the linear relationship?
What is strong?
The correlation coefficient's value ranges from __ to ___
-1 to +1
A regression equation is a "good fit" for the data if the residual plot looks this way
What is random scatter (no pattern)?
To check for an association between two categorical variables, we look for a difference in what?
What are the proportions of each response category within each category of the explanatory variable?
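As an illustration, the sketch below turns a two-way table of counts into the proportions of each response within each explanatory group, which are the quantities a segmented bar graph displays. The table, its category names, and the function `conditional_proportions` are all made up for this example.

```python
# Hypothetical two-way table: outer keys are explanatory categories,
# inner keys are response categories, values are counts.
table = {
    "reads daily": {"high grade": 30, "not high grade": 20},
    "does not read daily": {"high grade": 15, "not high grade": 35},
}

def conditional_proportions(two_way):
    """Divide each count by its row total to get the response distribution per group."""
    result = {}
    for group, counts in two_way.items():
        total = sum(counts.values())
        result[group] = {response: count / total for response, count in counts.items()}
    return result

props = conditional_proportions(table)
for group, dist in props.items():
    print(group, {response: round(p, 2) for response, p in dist.items()})
```

Here the proportion with a high grade differs between the two groups (0.6 versus 0.3), which suggests an association. Equal proportions across groups would suggest none.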
How do you "define the variables" in an LSRL?
What is replacing x and y with the names of the variables in context?
EX: predicted pain = 7 - 0.2(days since surgery)
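A small sketch that uses the example equation to make a prediction and then computes a residual as actual minus predicted. The function name `predicted_pain` and the observed value are hypothetical.

```python
def predicted_pain(days_since_surgery):
    """Example LSRL from the card above: predicted pain = 7 - 0.2(days since surgery)."""
    return 7 - 0.2 * days_since_surgery

days = 10
actual_pain = 4.5  # hypothetical observed value for a patient 10 days after surgery

predicted = predicted_pain(days)    # 7 - 0.2(10)
residual = actual_pain - predicted  # actual minus predicted
print(predicted, residual)
```

A negative residual means the line overpredicted: this patient reported less pain than the model expected.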
No correlation (r = 0) means that knowing one variable gives you ______.
no linear information about the other variable (a curved relationship can still have r = 0)
These "outliers" are outliers in the x-direction and can have a strong effect on the slope and the correlation coefficient
What are influential points?
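To see the effect, the sketch below computes the slope and r for a small made-up data set, then again after adding one point that is far out in the x-direction and off the trend. The helper `slope_and_r` and all values are hypothetical.

```python
import math

def slope_and_r(xs, ys):
    """Return (LSRL slope, correlation coefficient) for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

# Made-up data with a strong positive linear trend.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope_without, r_without = slope_and_r(xs, ys)

# Add one point far from the others in x that does not follow the trend.
slope_with, r_with = slope_and_r(xs + [20], ys + [5.0])

print(f"without the point: slope = {slope_without:.2f}, r = {r_without:.2f}")
print(f"with the point:    slope = {slope_with:.2f}, r = {r_with:.2f}")
```

In this example a single added point flattens the slope and weakens the correlation considerably, which is what makes it influential.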