See the Data
Be the Data
Infer this!
Gotta Collect 'Em All
Show Me the Data!
To Reject or Not to Reject
Margin of Terror!
Let's Regress this Situation
100

This is the process of gathering raw data from various sources.

What is data collection?

100

This branch of statistics utilizes measures of center, measures of spread, and some data visualizations to summarize a dataset. 

What is descriptive statistics?

100

This sampling method gives everyone an equal chance of being selected. 

What is random sampling?

100

This graph shows categories on the x-axis and the count for each category on the y-axis.

What is a bar chart?

100

This symbol should never be in an alternative hypothesis. 

What is =?

100

This value is half the width of the confidence interval.

What is the margin of error?
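
As a quick illustration of the clue, here is a minimal sketch that recovers the margin of error from the endpoints of a symmetric interval. The function name `interval_summary` and the example interval are made up for illustration.

```python
def interval_summary(lower, upper):
    """Return (point_estimate, margin_of_error) for a symmetric interval."""
    point_estimate = (lower + upper) / 2   # center of the interval
    margin_of_error = (upper - lower) / 2  # half the interval's width
    return point_estimate, margin_of_error

# Hypothetical interval for a proportion: (0.42, 0.58)
center, moe = interval_summary(0.42, 0.58)
print(f"point estimate = {center:.2f}, margin of error = {moe:.2f}")
```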

100

This assumption of correlation and regression can be checked using a Q-Q plot.

What is normality?

200

This term describes masking personal data, for example by removing an exact location while still providing a general idea of the location.

What is de-identification?

200

This measure of center is typically not included in the output of the .describe() method.

What is the mode?
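
A small sketch of this, assuming the clue refers to the pandas `.describe()` method on numeric data. The scores below are invented for illustration.

```python
import pandas as pd

# Hypothetical quiz scores, made up for illustration
scores = pd.Series([70, 85, 85, 90, 100], name="score")

# For numeric data the summary holds count, mean, std, min, quartiles, and max
summary = scores.describe()
print(summary)

# The mode is not part of that summary, so request it separately.
# .mode() returns a Series because a dataset can have more than one mode.
print("mode:", scores.mode().tolist())
```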

200

This sampling method requires the selection of every kth person for participation. 

What is systematic sampling?
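
One way to sketch the "every kth person" idea in code. The roster, the value of `k`, and the helper name `systematic_sample` are assumptions for illustration.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k items, then take every kth item."""
    rng = random.Random(seed)
    start = rng.randrange(k)       # random starting position in 0..k-1
    return population[start::k]    # every kth element from that start

# Hypothetical roster of 20 student IDs
roster = list(range(1, 21))
sample = systematic_sample(roster, k=5, seed=1)
print(sample)
```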

200

This visualization is most helpful when identifying outliers.

What is a boxplot? 

200

These are the null and alternative hypotheses for a test of whether two categorical variables are related.

What are

Null: No association

Alternative: Association

200

This is the symbol representing the point estimate when you are calculating a 95% confidence interval to estimate the true proportion of Clemson students that play Fortnite. 

What is $\hat{p}$?
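
For a concrete picture of the sample proportion as a point estimate, the sketch below computes it along with a 95% interval. It assumes the normal-approximation interval for a proportion, which may differ from the method used in class. The survey counts and the name `proportion_ci` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Return (p_hat, lower, upper) using the normal-approximation interval."""
    p_hat = successes / n                               # point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * sqrt(p_hat * (1 - p_hat) / n)          # margin of error
    return p_hat, p_hat - margin, p_hat + margin

# Hypothetical survey: 120 of 300 sampled students say they play
p_hat, lower, upper = proportion_ci(120, 300)
print(f"p_hat = {p_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```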

200

When checking your residual plot, you hope to see no pattern; if you do see one, this assumption might be violated.

What is homoscedasticity?

300

This term describes the type of study from which we can make causal claims. 

What is an experiment?

300

This branch of statistics describes taking information about a sample and extending it to the population. 

What is inferential statistics?

300

This sampling method utilizes individuals that are easily accessible for the researcher. 

What is convenience sampling?

300

This visualization is used to show the distribution of numerical variables and is often confused with its qualitative counterpart. 

What is a histogram?

300

When compared to each other, these two values help us make a decision about whether or not to reject the null hypothesis. 

What are the level of significance and the p-value?
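
The comparison can be sketched as a tiny decision rule. The sketch assumes the common convention of rejecting when the p-value is at or below the significance level; the function name `decide` and the example p-values are made up.

```python
def decide(p_value, alpha=0.05):
    """Compare the p-value to the significance level (alpha)."""
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

# Hypothetical p-values
print(decide(0.012))  # small p-value relative to alpha
print(decide(0.20))   # large p-value relative to alpha
```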

300

This describes what it means when we say, "We are 95% confident..."

What is: if we take 100 samples of the same size and build a confidence interval from each one, about 95 of the intervals will capture the population parameter and roughly 5 won't?
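
A simulation can make this interpretation concrete. The sketch below assumes a normal population with a known standard deviation so that a simple z-interval for the mean applies; the population values, sample size, and the name `coverage` are all illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def coverage(true_mean=50.0, sigma=10.0, n=30, trials=1000, seed=42):
    """Fraction of 95% z-intervals (sigma known) that contain true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)      # about 1.96
    margin = z * sigma / sqrt(n)         # same margin for every sample
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = mean(sample)             # point estimate for this sample
        if x_bar - margin <= true_mean <= x_bar + margin:
            hits += 1                    # this interval captured the parameter
    return hits / trials

rate = coverage()
print(f"{rate:.1%} of the intervals captured the true mean")
```

The printed rate should land near 95%, though any single run will vary a little.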

300

This graphic can help us visualize if two quantitative variables are linearly related. 

What is a scatter plot? 

400

This term describes data that was collected directly by the researcher using the data.

What is primary data?

400

This term refers to the true population value that we are interested in. 

What is a parameter?

400

This sampling method utilizes natural groupings and then selects whole groups for the sample. 

What is cluster sampling?
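
A minimal sketch of selecting whole groups. The dorm names, residents, and the helper `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # pick cluster names
    return {name: clusters[name] for name in chosen}    # keep entire groups

# Hypothetical natural groupings: students by residence hall
dorms = {
    "Hall A": ["Ana", "Ben", "Cy"],
    "Hall B": ["Dee", "Eli"],
    "Hall C": ["Fay", "Gus", "Hal", "Ivy"],
    "Hall D": ["Jo", "Kai"],
}
sample = cluster_sample(dorms, n_clusters=2, seed=3)
print(sample)
```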

400

These two graphs are not as commonly used today as they cannot handle large amounts of data. 

What are dot plots and stem-and-leaf plots?

400

This test should be used when trying to determine if the true average score on Overcooked for Clemson students is higher than the true average score on Overcooked for USC students.

What is a two-sample t-test?

400

This value is always contained within the confidence interval, and in our class it is always the center.

What is the point estimate?

400

This graphic allows us to see the strength of a linear relationship as a numeric value that is color coded. 

What is a heat map?

500

When discussing these terms we learned that how the data is organized, or unorganized, can affect the processing of the data. 

What is structured vs unstructured data?

500

This term says that as you increase the number of trials, the observed probability will approach the true probability. 

What is the law of large numbers? 
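
A short simulation can illustrate the idea with a fair coin, where the true probability of heads is 0.5. The flip counts, the seed, and the name `observed_proportion` are illustrative choices.

```python
import random

def observed_proportion(n_flips, seed=0):
    """Simulate n_flips fair coin flips and return the proportion of heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# The observed proportion tends to settle near 0.5 as the number of flips grows
for n in (10, 100, 10_000, 100_000):
    print(n, observed_proportion(n))
```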

500

This sampling method divides the population into groups before drawing a sample from each group. 

What is stratified sampling?
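
A minimal sketch of drawing from every group. The class-year groups, the names, and the helper `stratified_sample` are hypothetical, and the sketch takes the same number from each group for simplicity.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Draw a simple random sample of per_stratum members from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical strata: students grouped by class year
by_year = {
    "freshman": ["Ana", "Ben", "Cy", "Dee"],
    "sophomore": ["Eli", "Fay", "Gus"],
    "junior": ["Hal", "Ivy", "Jo"],
}
sample = stratified_sample(by_year, per_stratum=2, seed=7)
print(sample)
```

Unlike cluster sampling, every group appears in the sample, but only some members of each group are selected.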

500

This graph is used to view the relationship between two quantitative variables.

What is a scatter plot?

500

This describes what a p-value actually means.

What is the probability of observing a sample result as extreme as, or more extreme than, what we observed, given that the null hypothesis is true?

500

This is the generic interpretation of a 95% confidence interval.

What is "We are 95% confident that the true parameter is between the lower bound and the upper bound"?

500

This is the correct model for the following output:

What is $\hat{y} = -5872.09 + 50.15x$?
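
To show how a fitted line like this is used, here is a small sketch that plugs a value into the model. The coefficients come from the answer above; the input value of 200 and the function name `predict` are made up for illustration.

```python
def predict(x):
    """Predicted response from the fitted line y-hat = -5872.09 + 50.15x."""
    intercept = -5872.09   # predicted y when x is 0
    slope = 50.15          # predicted change in y for a one-unit increase in x
    return intercept + slope * x

x_new = 200  # hypothetical predictor value
print(round(predict(x_new), 2))
```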