R
R2
Regression1
Regression2
Papers2
100

How many stars represent a good p-value?

3 (***)

100

What does the update() function do? (what does it update?)

It updates and refits a model (e.g., a linear model), typically with a changed formula
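
As a minimal sketch of what update() does, the example below refits a linear model with one extra predictor. The built-in mtcars data and the names fit1 and fit2 are assumptions chosen for illustration; they are not part of the quiz.

```r
# Assumption: mtcars (built into R) stands in for the course data.
fit1 <- lm(mpg ~ wt, data = mtcars)

# ". ~ . + hp" keeps the current response and predictors and adds hp,
# then update() refits the model with the new formula.
fit2 <- update(fit1, . ~ . + hp)

print(formula(fit1))
print(formula(fit2))
print(coef(fit2))
```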

100

A first-order polynomial fit of the data is what shape?


A line

100

If the correlation coefficient is 0.2, what does that tell us about the variables?

They are only weakly correlated

100

(T/F) As sample size increases, more variation can be modeled

True

200

Function for plotting a regression line in R

abline()

200

How do you access data from a CSV file?

read.csv()

First, the CSV file must be downloaded into the project directory (or its full path passed to read.csv())

200

Are higher-order polynomials always better?

No

200

How do you transform a plot whose best-fit curve is exponential?

Take the log of the data
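
A small sketch of the log transform, under stated assumptions: the data are simulated here (true rate 0.5, true scale 2, both made up for illustration), and taking log(y) turns the exponential trend into a straight line that lm() can fit.

```r
# Simulated exponential data; the constants 2 and 0.5 are illustrative assumptions.
set.seed(1)
x <- seq(1, 10, length.out = 50)
y <- 2 * exp(0.5 * x) * exp(rnorm(50, sd = 0.1))  # multiplicative noise

# On the log scale the relationship is linear: log(y) = log(2) + 0.5 * x + noise
fit <- lm(log(y) ~ x)
print(coef(fit))  # slope should be close to 0.5, intercept close to log(2)
```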

200

How many variables can predict question scores on Stack Overflow?

(6, 7, 16, 12, 5)

7

300

Code for saving a plot as a PDF

dev.print(device=pdf, "name.pdf")

300

Code to remove missing (NA) data values

na.omit(df)
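
A minimal sketch of na.omit() on a toy data frame (the values in df are made up for illustration): every row containing at least one NA is dropped.

```r
# Toy data frame with missing values in rows 2 and 3 (illustrative only).
df <- data.frame(x = c(1, 2, NA, 4), y = c(10, NA, 30, 40))

# na.omit() keeps only the complete rows (here rows 1 and 4).
clean <- na.omit(df)
print(clean)
print(nrow(clean))
```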

300

If the correlation coefficient is -0.9, what does that tell us about the data?

They have a strong negative correlation

300

A great p-value is less than what number?

0.001

300

What does a clustering model do?

Groups similar data points into clusters

400

Code to subdivide the plotting area into four regions

(hint: used for diagnostic plots)

par(mfrow=c(2,2))
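
A sketch of the usual use case, assuming a model fitted on the built-in mtcars data (an illustrative stand-in, not the quiz's data): plot() on an lm object draws four diagnostic plots by default, and the 2x2 layout shows them together.

```r
# Assumption: mtcars stands in for the course data.
fit <- lm(mpg ~ wt, data = mtcars)

# Split the plotting area into a 2x2 grid, remembering the old settings.
old_par <- par(mfrow = c(2, 2))

# Default diagnostics for an lm fit: four plots, one per grid cell.
plot(fit)

# Restore the previous single-panel layout.
par(old_par)
```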

400

What is wrong with this code?

lm.fit=lm(medv, lstat)

lm.fit=lm(medv~lstat)

400

Why do we use regression on data sets?

To predict data and identify correlations

400

y=0.28x + 9.916

x=0.3

what is y?

9.95  or 10.00 or 10.05 or 10.10

10
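
The arithmetic can be checked directly in R: 0.28 * 0.3 = 0.084, and 0.084 + 9.916 = 10.

```r
# Evaluate the fitted line y = 0.28x + 9.916 at x = 0.3
x <- 0.3
y <- 0.28 * x + 9.916
print(y)
```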

400

The full meaning of CDRs

Hint: used in the Sri Lanka paper

Call Detail Records


500

How do you get confidence intervals?

predict(lm.fit, data.frame(lstat=c(30)), interval="confidence")
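
A self-contained sketch of the same idea, assuming the built-in mtcars data in place of the course data (the names fit, new_data and ci are illustrative): predict() takes the fitted model first, then a data frame whose column name matches the predictor.

```r
# Assumption: mtcars stands in for the course data.
fit <- lm(mpg ~ wt, data = mtcars)

# The new data must use the same predictor name as the model formula.
new_data <- data.frame(wt = c(3))

# Returns the fitted value plus the lower and upper confidence bounds
# (95% by default, controlled by the level argument).
ci <- predict(fit, new_data, interval = "confidence")
print(ci)  # columns: fit, lwr, upr
```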

500

What is wrong with this code? 

abline(lm.fit,lwd=3)

abline(lm.fit,lwd=3,col="red")

plot(lstat,medv,col="red")

plot(lstat,medv,pch=20)

plot(lstat,medv,pch='+")

plot(1:20,1:20,pch=1:20)

plot(lstat,medv,pch="+")

500

A term used to describe the case when the predictors in a multiple regression model are correlated

Multicollinearity 

500

Description of heteroscedasticity

Non-constant variance of the error terms in a linear regression

500

(T/F) A low R value means that there is no statistical relationship between variables

False