What does read.csv do? How do you format it?
Reads a file in table format and creates a data frame from it, with cases corresponding to lines and variables to fields in the file.
Example: data <- read.csv("file.csv")
What is the pipe operator?
%>% (from the magrittr package; also available after library(dplyr))
Bonus: what do you use the pipe operator for?
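To chain operations: it passes the result on its left as the first argument of the function on its right, so a sequence of steps can be written in order instead of as nested calls.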
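A minimal sketch of the pipe in use, assuming the dplyr package is installed (it makes %>% available). The built-in mtcars data set and the chosen columns are only for illustration:

```r
library(dplyr)

# Each step receives the result of the previous step as its first argument.
result <- mtcars %>%
  filter(cyl == 4) %>%   # keep only the 4-cylinder cars
  select(mpg, wt)        # then keep two columns

# The same work as a nested call, which has to be read inside-out.
nested <- select(filter(mtcars, cyl == 4), mpg, wt)
```

Both forms give the same data frame; the piped version reads in the order the steps happen.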
What is ANOVA?
Analysis of Variance: a statistical test of whether the means of two or more groups or treatments differ.
What does this error mean?
Error in install.packages : object 'ggplot2' not found
The package name was not quoted, so R looked for an object called ggplot2 and could not find one.
Install with install.packages("ggplot2"), then load with library(ggplot2).
What does write.csv do? How do you format the code?
Writes a data frame to a file in CSV format.
write.csv(data, "file.csv")
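A short round-trip sketch that writes a data frame and reads it back. The data frame contents are hypothetical, and a temporary file is used so nothing in the working directory is touched:

```r
# Hypothetical example data; any data frame works.
data <- data.frame(id = 1:3, score = c(90.5, 82.0, 77.25))

# tempfile() returns a path to a file that does not exist yet.
path <- tempfile(fileext = ".csv")

# row.names = FALSE leaves out the extra column of row numbers.
write.csv(data, path, row.names = FALSE)

# Read the file back into a new data frame.
data2 <- read.csv(path)
```

Without row.names = FALSE, the file gains an unnamed first column that read.csv later reads back as a column called X.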
How do you access the library minfi?
library(minfi) (package names are case-sensitive, and this one is all lowercase)
What does this code do?
data$new_variable <- data$variable1 + data$variable2
Creates a new variable (column) called new_variable that holds the sum of variable1 and variable2 for each row.
How do you use the ANOVA function?
model <- aov(variable ~ group, data=data)
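A minimal sketch using PlantGrowth, a data set that ships with R (plant weight under a control and two treatments). The choice of data set is only for illustration:

```r
# Fit a one-way ANOVA: does mean weight differ across the three groups?
model <- aov(weight ~ group, data = PlantGrowth)

# The ANOVA table (F statistic and p-value) comes from summary().
result <- summary(model)
print(result)
```

Printing the model object itself shows only sums of squares, so summary() is the usual next step.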
What does this error mean?
Error: unexpected string constant in "c("Group 1," "Group 2""
The comma is inside the quotation marks of the first string, so R sees two strings in a row with nothing separating them. The correct code is c("Group 1", "Group 2").
What are colnames and rownames?
colnames: Gets or sets the column names of a matrix or data frame
rownames: Gets or sets the row names of a matrix or data frame
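A small sketch of getting and setting names. The data frame and the new names are made up for illustration:

```r
data <- data.frame(a = 1:2, b = c("x", "y"))

colnames(data)                          # get: "a" "b"
colnames(data) <- c("id", "label")      # set all column names at once

rownames(data) <- c("first", "second")  # set row names
rownames(data)                          # get: "first" "second"
```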
Which function do you use to select only column 1 and column 2 from a given data set?
select(data, column1, column2) from the dplyr package, or with the pipe: data %>% select(column1, column2)
What is rbind?
Binds two data frames by rows, stacking one on top of the other; both must have the same columns.
ex: rbind(data1, data2)
How do you write a linear model function?
model <- lm(y ~ x1 + x2, data = data)
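A minimal sketch with the built-in mtcars data. The outcome and predictors chosen here are only an example:

```r
# Model fuel economy (mpg) from weight (wt) and horsepower (hp).
model <- lm(mpg ~ wt + hp, data = mtcars)

coef(model)     # the intercept plus one slope per predictor
summary(model)  # standard errors, p-values and R-squared
```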
What does this error mean?
Error in maen(c(1, 7, 13)) : could not find function "maen"
maen is a misspelling of mean, so R cannot find a function with that name.
Which function accesses a specific variable (column) in a data frame?
data$variable
What does the summarise() function do?
Collapses a data frame into summary statistics that you define, such as the mean of column1 and column2. Combined with group_by(), it returns one row per group.
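A minimal sketch of summarise() with group_by(), assuming the dplyr package is installed. The built-in mtcars data and the summary names mean_mpg and n are chosen only for illustration:

```r
library(dplyr)

# One summary row per group: mean mpg and number of cars for each cylinder count.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n = n())
```

The result has three rows, one for each value of cyl (4, 6 and 8).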
What is cbind?
Binds two data frames by columns, placing them side by side; both must have the same number of rows.
ex: cbind(data1, data2)
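A small sketch contrasting rbind and cbind. All of the data frames here are hypothetical:

```r
data1 <- data.frame(id = 1:2, score = c(10, 20))
data2 <- data.frame(id = 3:4, score = c(30, 40))

# rbind stacks rows, so the column names must match.
stacked <- rbind(data1, data2)   # 4 rows, 2 columns

# cbind adds columns, so the row counts must match.
extra <- data.frame(group = c("a", "b"))
widened <- cbind(data1, extra)   # 2 rows, 3 columns
```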
What does the cor function do?
Computes the correlation coefficient (Pearson by default) between two numeric variables.
What does this error mean?
Warning in mean.default(gender): argument is not numeric or logical: returning
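gender is not numeric or logical (it is probably a character vector or a factor), so mean() cannot average it and returns NA. Summarize it with table(gender) instead, or convert it to numeric only if the values really are numbers.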
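A small sketch that reproduces the situation and shows two summaries that do work for a categorical variable. The gender values are made up:

```r
gender <- c("F", "M", "F")   # hypothetical character data

# mean() on character data gives NA; the warning is suppressed here for the demo.
m <- suppressWarnings(mean(gender))

# Counts per category are the usual summary for a categorical variable.
counts <- table(gender)

# A logical vector can be averaged, which gives a proportion.
prop_f <- mean(gender == "F")
```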
What does the table function do? How do you use it?
Creates a contingency table of the counts at each combination of factor levels.
Example: table(data$variable)
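A short sketch of one-way and two-way tables, using the built-in mtcars data as a stand-in for your own:

```r
# One-way table: number of cars for each cylinder count.
one_way <- table(mtcars$cyl)

# Two-way table: cylinder count (rows) by transmission type (columns).
two_way <- table(mtcars$cyl, mtcars$am)
```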
How do the merge and subset functions work?
DOUBLE POINTS
subset(): returns part of a data frame, keeping the rows that meet a logical condition and/or the columns named in select
Ex: subset(ECHO_data, select=CBCL_older)
merge(): combines two data frames based on a variable in common
Ex: echoDataWide <- merge(data, echoDataWide, by = 'ID', all = TRUE) (all = TRUE keeps unmatched rows from both data frames)
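A small sketch of both functions on made-up data. The data frames visits and scores and their columns are hypothetical:

```r
visits <- data.frame(ID = c(1, 2, 3), age = c(6, 7, 8))
scores <- data.frame(ID = c(2, 3, 4), cbcl = c(55, 61, 48))

# subset(): rows chosen by a condition, columns chosen by name.
older <- subset(visits, age >= 7, select = c(ID, age))

# merge(): by default only IDs found in both data frames are kept (2 and 3).
inner <- merge(visits, scores, by = "ID")

# all = TRUE keeps every ID from either side and fills the gaps with NA.
outer <- merge(visits, scores, by = "ID", all = TRUE)
```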
Write a basic line to rename a variable (hint: use colnames).
colnames(data)[colnames(data)=="oldname"] <- "newname"
How do you use the cor function?
cor(data$x, data$y)
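A short sketch with the built-in mtcars data, plus a made-up pair of vectors showing how missing values are handled:

```r
# Pearson correlation by default; negative here because heavier cars get lower mpg.
r <- cor(mtcars$mpg, mtcars$wt)

# An NA in either vector makes the result NA unless incomplete pairs are dropped.
x <- c(1, 2, 3, NA)
y <- c(2, 4, 6, 8)
r_na <- cor(x, y)                        # NA
r_ok <- cor(x, y, use = "complete.obs")  # uses only the three complete pairs
```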
What does this error mean?
Error in `[.data.frame`(dat, dat$Age == 7) : undefined columns selected
The comma is missing inside the brackets. Without it, R treats dat$Age == 7 as a column selector instead of a row filter, and it points at columns that do not exist. To select the rows where Age is 7, use dat[dat$Age == 7, ].
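A small sketch of row selection with the comma in place. The data frame dat and its values are hypothetical:

```r
dat <- data.frame(Age = c(6, 7, 7, 8, 9), Score = c(10, 12, 15, 11, 14))

# dat[dat$Age == 7]   # no comma: the condition is applied to columns, which fails

# With the comma, the condition picks rows and all columns are kept.
sevens <- dat[dat$Age == 7, ]
```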