The topic of this data.
What are school success rates?
There are this many data points in the data set.
What is 1000?
The typical value of grad %>% ggplot(aes(x=act)) + geom_histogram(bins=16) tells us this about this group of students.
The students are above the national average (around 19 points).
This is how to look at the data.
What is > grad ?
> grad %>% ggplot(aes(x=p_income)) + geom_histogram()
Show me a histogram!
This is a categorical variable.
What is parental level of education?
This is the mean SAT score.
What is 1999.9 (or 2000)?
The data that conveys a diverse population.
What is the parents' education?
This is the collection of students that would be on honor roll in high school.
What is filtering on GPA?
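A minimal sketch of filtering on GPA, assuming dplyr is installed. The `grad_toy` data frame and the 3.5 honor-roll cutoff are made-up stand-ins for illustration; only the column names `hs_gpa` and `un_gpa` come from the commands used here.

```r
library(dplyr)

# Made-up stand-in for the real grad data (values are illustrative only)
grad_toy <- data.frame(
  hs_gpa = c(2.8, 3.6, 3.9, 3.1, 3.5),
  un_gpa = c(2.5, 3.4, 3.7, 2.9, 3.2)
)

# Keep only students at or above an assumed honor-roll cutoff of 3.5
honor_roll <- grad_toy %>% filter(hs_gpa >= 3.5)

# Number of students left after filtering
nrow(honor_roll)
```

The same pattern should work on the real data by swapping `grad_toy` for `grad` and choosing whatever cutoff the class agrees on.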
> grad %>% ggplot(aes(x=yrs_to_grad, y=un_gpa,group=yrs_to_grad)) + geom_boxplot()
SHOW ME A BOXPLOT!
The question that this data could answer.
What is
The distribution and description of the variable is this.
What is [the shape, typical value and spread] of this data?
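One possible way to put numbers on shape, typical value and spread, using base R only. The scores in `act_toy` are made up for illustration and are not the real `act` column.

```r
# Made-up ACT scores standing in for grad$act (illustrative only)
act_toy <- c(18, 21, 22, 22, 23, 24, 25, 27, 30)

typical <- median(act_toy)  # typical value (center)
spread  <- IQR(act_toy)     # spread of the middle half of the data

# Five-number summary plus the mean
summary(act_toy)

# Shape hint: a mean above the median suggests a longer right tail
skew_hint <- mean(act_toy) - median(act_toy)
skew_hint
```

A histogram is still the better tool for judging shape; these numbers only back up what the picture shows.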
The conclusion from this graph is this.
grad %>% ggplot(aes(x=hs_gpa, y=un_gpa,group=hs_gpa)) + geom_boxplot()
What is the conclusion?
The two (or more) data sets separated by a variable.
What are two variables with filtered data?
The graph that shows a correlation between two variables.
A two variable graph!
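A sketch of one kind of two-variable graph, a scatterplot, assuming ggplot2 is installed. The `grad_toy` values are made up for illustration; `hs_gpa` and `un_gpa` are the column names used in the boxplot command in this set.

```r
library(ggplot2)

# Made-up stand-in for the real grad data (values are illustrative only)
grad_toy <- data.frame(
  hs_gpa = c(2.8, 3.1, 3.5, 3.6, 3.9),
  un_gpa = c(2.5, 2.9, 3.2, 3.4, 3.7)
)

# One point per student: high school GPA on x, university GPA on y
p <- ggplot(grad_toy, aes(x = hs_gpa, y = un_gpa)) +
  geom_point()

p
```

Points that climb from lower left to upper right suggest a positive relationship between the two variables.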
The supporting/contradicting evidence for this data (find evidence and discuss!).
What did you find?
The anomalies (which anomalies) in this data could indicate this.
What are the outliers?
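A minimal sketch of flagging outliers with the common 1.5 x IQR rule, in base R. The values in `gpa_toy` are made up for illustration, and the rule itself is one convention among several, not the only definition of an outlier.

```r
# Made-up values standing in for one numeric column such as un_gpa
gpa_toy <- c(2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 1.0)

# Quartiles and the fences 1.5 IQRs beyond them
q <- quantile(gpa_toy, c(0.25, 0.75))
fence_low  <- q[[1]] - 1.5 * IQR(gpa_toy)
fence_high <- q[[2]] + 1.5 * IQR(gpa_toy)

# Anything outside the fences is flagged as a possible outlier
outliers <- gpa_toy[gpa_toy < fence_low | gpa_toy > fence_high]
outliers
```

A flagged value is a prompt to look closer (data entry error? unusual student?), not an automatic reason to drop it.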
The students in this group are this.
What do we learn?
This is the way to present percentages of data.
What is a frequency table?
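A sketch of a frequency table with percentages, in base R. The variable name `p_educ` and its category labels are hypothetical; the real data may store parents' education under a different name and with different levels.

```r
# Made-up categories; p_educ is a hypothetical name for parents' education
p_educ <- c("HS", "College", "HS", "Graduate",
            "College", "HS", "College", "HS")

# Counts per category
counts <- table(p_educ)

# Convert counts to percentages of the whole, rounded to one decimal
percents <- round(100 * prop.table(counts), 1)
percents
```

The percentages add up to 100, which is a quick check that no category was left out.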
This multiple-variable graph shows this.
What is a multi-variable graph?
Other data sources can correspond to this one, and each has its own ups and downs!
Show me the other data sources!
The overarching distributions of the data (with evidence)
The impacts of 'this' variable on 'that' variable.
What is the correlation?
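A minimal sketch of measuring how strongly two numeric variables move together, using base R's `cor()`. The two vectors are made up for illustration and are not the real `hs_gpa` and `un_gpa` columns.

```r
# Made-up paired values (one pair per student, illustrative only)
hs_gpa_toy <- c(2.8, 3.1, 3.5, 3.6, 3.9)
un_gpa_toy <- c(2.5, 2.9, 3.2, 3.4, 3.7)

# Pearson correlation: near +1 or -1 is strong, near 0 is weak
r <- cor(hs_gpa_toy, un_gpa_toy)
r
```

A strong correlation describes an association only; it does not by itself show that one variable causes the other.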
The simulation of this variable would demonstrate that.
How might one create a simulation of percentages?
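One possible approach, sketched in base R: repeatedly draw a sample of yes/no outcomes and record the percentage of "yes" each time. The true rate of 60%, the sample size of 100 and the 1000 repetitions are all assumptions chosen for illustration.

```r
set.seed(1)  # makes the random draws repeatable

p_true <- 0.6   # assumed true proportion (illustrative)
n      <- 100   # students per simulated sample
reps   <- 1000  # number of simulated samples

# Each repetition: draw n yes/no outcomes, record the percent of "yes"
sim_pct <- replicate(reps, 100 * mean(rbinom(n, 1, p_true)))

# Center and spread of the simulated percentages
mean(sim_pct)
sd(sim_pct)
```

A histogram of `sim_pct` would show how much a percentage can vary from sample to sample by chance alone.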
The spread of this data informs us of this.
What does it say?