Correlations
Simple Linear Regression
Multiple Linear Regression
APA Reporting
Basic R Commands
100

Why standardize scores before running a correlation?

To get each scale in the same metric, units of SD
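
A minimal sketch of what standardizing does, using made-up numbers (hours and score are hypothetical variables, not from any course dataset). It shows that z-scores have an SD of 1 and that Pearson's r is the same before and after standardizing:

```r
# Hypothetical data: two variables measured on very different scales
hours <- c(2, 4, 5, 7, 9)
score <- c(55, 60, 70, 80, 95)

# scale() subtracts the mean and divides by the SD (z-scores)
z_hours <- as.numeric(scale(hours))
z_score <- as.numeric(scale(score))

sd(z_hours)               # SD of a z-scored variable is 1 (up to rounding)
cor(hours, score)         # r on the raw scores
cor(z_hours, z_score)     # same r on the standardized scores
```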

100

What are fitted values and do researchers generally want them to differ much from observed values?

They are the predicted values (the points on the line of best fit) based on the data present. We don't want them to differ much from the observed values, because large differences indicate poor model fit.
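
A small illustration with made-up x and y values (hypothetical, for demonstration only): the gap between each observed value and its fitted value is the residual.

```r
# Hypothetical data
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

fit <- lm(y ~ x)

predicted <- fitted(fit)      # fitted values: points on the line of best fit
residual  <- y - predicted    # observed minus fitted

# The hand-computed gaps match what resid() returns
all.equal(unname(residual), unname(resid(fit)))
```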

100

How would you interpret the following code: 

mlr1 <- lm(salary ~ pubs + years, profs)


Salary (the DV) is being regressed onto pubs and years (the IVs) using the dataset profs

100

Why report M and SD in text for a t-test?

Readers can understand which group had the higher/lower score, rather than simply knowing the two groups are statistically different.

100

How do you install packages in R?

install.packages("dplyr") --for any package

200

Which correlation is stronger: -.85 or .70?

-.85

200

Researchers run an SLR and find R2 = .33. Can they determine the correlation coefficient, and if so, how?

Yes, by taking the square root of R2: r = .57. The sign of r matches the sign of the slope.
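
The arithmetic, as a quick sketch. It treats .33 as the R-squared value from a simple regression; the square root gives only the magnitude of r, so the sign still has to come from the slope.

```r
r_squared   <- 0.33
r_magnitude <- sqrt(r_squared)   # size of r; the sign comes from the slope

round(r_magnitude, 2)            # 0.57
```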

200

Researchers ran the code below. First answer what they are trying to do. Second, determine whether this is the "best" course of action. What could be another option? (Talk about type I error!)

lm1 <- lm(salary~pubs,profs)

summary(lm1)

lm2 <- lm(salary~years,profs)

summary(lm2)

They ran two separate SLRs, one using pubs to explain salary and one using years to explain salary. A better option is one MLR using both variables to explain salary, which helps control Type I error: every test run at alpha = .05 adds a ~5% chance of a false positive, so across two tests the Type I error rate is ~10%.
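
Where the ~10% comes from, as a rough sketch. It assumes alpha = .05 for each test and that the two tests are independent, which is a simplification.

```r
alpha   <- 0.05
n_tests <- 2

# Chance of at least one false positive across all tests
familywise_error <- 1 - (1 - alpha)^n_tests
familywise_error   # 0.0975, i.e. roughly 10%
```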

200

When reporting any test in APA, you generally follow a similar format. What is the format?

Letter of the test statistic(degrees of freedom) = test statistic, p value, effect size. For example: F(1, 4) = 9.72, p = .003, ηp2 = .15.

200

What does the double colon mean in R? For example: knitr::include_graphics()

It means you can use the specific function after the double colon without first loading the package (knitr) with library(). The package still has to be installed.
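
A small example using the tools package, which ships with R but is usually not attached at startup. The file name "scores.csv" is made up for illustration.

```r
# No library(tools) call is needed; the :: prefix says where to find the function
ext <- tools::file_ext("scores.csv")
ext   # "csv"
```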

300

There is one component of an association that an r value cannot tell you. What is it and how do you determine that component?

Form or shape of the relationship; you will need to create a scatterplot

300

Researchers conduct an experiment to determine how stress influences sleep (in hours). They find the following unstandardized beta coefficient of -.331. How would they interpret this? (Discuss direction, strength, and the literal interpretation of the number.)

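For every one-unit increase in stress, sleep decreases by .331 hours. The coefficient is negative, indicating a negative relationship. Strength can't be judged from an unstandardized coefficient alone, because its size depends on the units of measurement; the standardized beta or r is needed for that.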

300

In this MLR, how would you interpret the CIs for "years"?

                   2.5 %     97.5 %
(Intercept) 36331.727479 49836.9094
pubs         -204.332404   447.9374
years          -2.167321  1967.5461

The interval for "years" (-2.17 to 1967.55) includes zero, which means the beta for "years" may be zero, and therefore the slope may be zero (AKA the variable is not a significant predictor at the .05 level).
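
A quick check of that logic, using the bounds printed for years in the table. Output in this layout typically comes from confint() on a fitted model; the variable names here are just for illustration.

```r
# Bounds for "years" copied from the table
lower <- -2.167321
upper <- 1967.5461

# TRUE when zero falls inside the interval
includes_zero <- lower <= 0 & upper >= 0
includes_zero   # TRUE
```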

300

Why report effect size?

It tells readers the practical significance, like how meaningful the difference between/within groups is. Also for meta-analytic purposes

300

Explain what is happening in this line of code: mydf <- data.frame(var1,var2,var3)

You are creating a new object called mydf by creating a dataframe of three variables
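
A tiny runnable version with made-up vectors (the values are hypothetical). Each vector becomes one column, and the columns take the vectors' names.

```r
# Three hypothetical variables of equal length
var1 <- c(1, 2, 3)
var2 <- c("a", "b", "c")
var3 <- c(TRUE, FALSE, TRUE)

mydf <- data.frame(var1, var2, var3)

names(mydf)   # "var1" "var2" "var3"
nrow(mydf)    # 3
```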

400

Researchers find r = .11 with a 95% confidence interval of [-.05, .22]. What can the researchers conclude?

The CI includes zero, which means a correlation of zero is plausible. So, the correlation is not statistically significant at the .05 level.

400

Researchers conduct an experiment to determine how stress influences sleep (in hours). They find the following standardized beta coefficient of -.85. How would they interpret this? (Discuss direction, strength, and the literal interpretation of the number.)

For every one SD increase in stress, sleep decreases by .85 SD. The beta is negative, indicating a negative relationship, and is strong in magnitude.

400

For the output below, interpret the significance of the two predictors, pubs and years, on the DV (income).

Coefficients:
            Estimate Std. Error t value      Pr(>|t|)
(Intercept)  43084.3     3099.2  13.902 0.00000000924
pubs           121.8      149.7   0.814        0.4317
years          982.7      452.0   2.174        0.0504

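Neither predictor significantly predicts income at the .05 level. pubs is clearly non-significant (p = .432); years is borderline (p = .0504), so you could make the case that it might matter. We'll want to check those CIs!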

400

Can Cohen's d be negative?

Yes. The sign only reflects which group's mean was subtracted from the other, so your output will sometimes show a negative value. Many researchers report the absolute value and describe the direction of the difference in words.
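
A sketch of why the sign flips, using two made-up groups and a hand-computed pooled SD (one common way to compute d, assumed here; the group names and values are hypothetical).

```r
# Hypothetical scores for two groups
group_a <- c(5, 6, 7, 8, 9)
group_b <- c(3, 4, 5, 6, 7)

n_a <- length(group_a)
n_b <- length(group_b)

# Pooled standard deviation across both groups
pooled_sd <- sqrt(((n_a - 1) * var(group_a) + (n_b - 1) * var(group_b)) /
                    (n_a + n_b - 2))

d_a_minus_b <- (mean(group_a) - mean(group_b)) / pooled_sd   # positive
d_b_minus_a <- (mean(group_b) - mean(group_a)) / pooled_sd   # same size, negative

c(d_a_minus_b, d_b_minus_a)
```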

400

How would you troubleshoot if mean(var1) didn't work and the error message says object 'var1' not found?

Tell it where to look by providing the dataframe and $ like this: mean(mydf$var1)
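
A minimal example with a made-up dataframe (the values are hypothetical). Here var1 exists only as a column of mydf, so it has to be reached through the dataframe.

```r
# var1 lives inside mydf, not in the workspace on its own
mydf <- data.frame(var1 = c(2, 4, 6), var2 = c(1, 3, 5))

mean(mydf$var1)          # 4
with(mydf, mean(var1))   # another way to point R at the dataframe
```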

500

Would you expect the correlation between High School GPA and College GPA to be higher when taken from your entire high school class or when taken from only the top 20 students? Why?

The correlation will probably be higher if we compared the GPA of the entire high school class, because the range restriction in the top 20 students will lead to a smaller correlation.
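
A simulation sketch of range restriction. Everything here is assumed for illustration: the GPAs are simulated as standardized scores with a true correlation of about .6, and a large class of 5,000 with the top 10% selected is used (rather than a top 20) so the pattern is stable from run to run.

```r
set.seed(1)
n <- 5000

# Simulated standardized GPAs with a population correlation of about .6
hs_gpa      <- rnorm(n)
college_gpa <- 0.6 * hs_gpa + rnorm(n, sd = 0.8)

# Keep only the top 10% on high school GPA (restricted range)
top <- hs_gpa >= quantile(hs_gpa, 0.90)

cor_all <- cor(hs_gpa, college_gpa)
cor_top <- cor(hs_gpa[top], college_gpa[top])

c(all_students = cor_all, top_only = cor_top)   # top_only is noticeably smaller
```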

500

Will there be a difference between the outputs produced by the two models below? If so, what is it; if not, why not? Also, what is the measure of effect size for a regression?

mylm_sd <- lm(scale(miles) ~ scale(energy), jogs) #scale functions centers and/or scales columns of a numeric matrix

summary(mylm_sd)

#compare to a zscore model

mylmz <- lm(zmiles ~ zenergy, jogs)

summary(mylmz)

There will be no difference in the outputs, because both models use standardized scores (assuming zmiles and zenergy are z-scored versions of miles and energy). The measure of effect size for a regression is R2.
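
A runnable sketch of the comparison. The jogs data below are made up, and zmiles and zenergy are assumed to be ordinary z-scores of miles and energy.

```r
# Hypothetical stand-in for the jogs dataset
jogs <- data.frame(miles  = c(1, 3, 2, 5, 4, 6),
                   energy = c(2, 4, 3, 7, 5, 8))

# z-score columns computed by hand
jogs$zmiles  <- (jogs$miles  - mean(jogs$miles))  / sd(jogs$miles)
jogs$zenergy <- (jogs$energy - mean(jogs$energy)) / sd(jogs$energy)

mylm_sd <- lm(scale(miles) ~ scale(energy), jogs)
mylmz   <- lm(zmiles ~ zenergy, jogs)

slope_sd <- as.numeric(coef(mylm_sd))[2]
slope_z  <- as.numeric(coef(mylmz))[2]

all.equal(slope_sd, slope_z)                       # same standardized slope
all.equal(slope_z, cor(jogs$miles, jogs$energy))   # and in an SLR it equals r
```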

500

Researchers ran a multiple regression testing how years since PhD and number of publications predict income. How should they interpret a Multiple R-squared value of 0.5305?

53% of the variance in income can be explained by years and pubs together.
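
The arithmetic behind that answer, as a short sketch. It treats 0.5305 as an R-squared value (proportion of variance explained), which is what the 53% interpretation requires; the multiple R would be its square root.

```r
r_squared <- 0.5305

round(100 * r_squared)        # 53 (percent of variance explained)
round(sqrt(r_squared), 2)     # 0.73 (the multiple R)
```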

500

How would you report the following output in APA style:

Pearson's product-moment correlation

data:  corr$time and corr$pubs
t = 3.1393, df = 13, p-value = 0.007832
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.2175819 0.8746896
sample estimates:
      cor
0.6566546

There was a significant, positive correlation between time and pubs, r(13) = .66, p = .008. 
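
A sketch that rebuilds the pieces of that sentence from the printed output. The values of r and df are copied from the output above; the formatting helper is hypothetical and simply drops the leading zero, as APA style does for statistics that cannot exceed 1.

```r
r  <- 0.6566546   # "cor" from the output
df <- 13

# t and p recovered from r and df (they match the printed t and p-value)
t_value <- r * sqrt(df) / sqrt(1 - r^2)
p_value <- 2 * pt(abs(t_value), df, lower.tail = FALSE)

# Hypothetical helper: format to a fixed number of decimals without a leading zero
apa_num <- function(x, digits) sub("^0", "", sprintf(paste0("%.", digits, "f"), x))

apa_text <- paste0("r(", df, ") = ", apa_num(r, 2), ", p = ", apa_num(p_value, 3))
apa_text   # "r(13) = .66, p = .008"
```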

500

If you can't remember how to call in a file, what can you do (besides looking it up on a search engine)?

Find the file in your R files, click on it, choose Import Dataset, and copy and paste the generated code into a chunk.
