What is the level of measurement of gender?
Nominal
What is the first step when you create an R Markdown file and are ready to manipulate your datasets?
Load your packages using library()
What is the cutoff of the p-value for a significant result?
p < 0.05 is the conventional cutoff; p < 0.01 and p < 0.001 are stricter thresholds
For T-test and ANOVA, the IV needs to be _____ measurement, and the DV needs to be _______.
IV: Nominal; DV: interval/ratio
For Chi-squared, the IV needs to be _____ (measurement level), and the DV needs to be _______. For Pearson's r, the IV needs to be _____ (measurement level), and the DV needs to be _______.
Chi-squared: Nominal, Nominal
Pearson's r: interval/ratio, interval/ratio
What is the level of measurement of a 1 to 5 point scale?
Interval (by convention; strictly speaking, the ordered points are ordinal)
str() is used for ________.
Examining the structure of an object, including the data type of each variable
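A minimal sketch of str() in use. The survey data frame and its columns are made up for illustration.

```r
# Hypothetical data frame, used only for illustration.
survey <- data.frame(
  gender = c("female", "male", "female"),
  score  = c(4, 5, 3),
  stringsAsFactors = FALSE
)

# Prints the number of rows and columns, then each column's type
# and its first few values.
str(survey)
```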
What is a confidence interval?
A range of values within which the population parameter is estimated to fall.
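One way to see such a range in R, as a sketch: t.test() on a single sample reports a 95% confidence interval for the population mean by default. The scores vector is made-up data.

```r
# Made-up sample of 1-to-5 ratings.
scores <- c(4, 5, 3, 4, 2, 5, 4, 3)

# One-sample t-test; conf.int holds the lower and upper bounds
# of the 95% confidence interval for the mean.
result <- t.test(scores)
ci <- as.numeric(result$conf.int)
ci
```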
What are degrees of freedom?
The number of observations that are free to vary in calculating each statistic
Which statistical method requires your variables to be continuous variables?
Pearson's r
Why do we say a "true" probability sample only exists hypothetically?
Because we cannot guarantee that everybody has an equal chance to be selected in reality.
What are the typical data processing steps in R?
Read in data > Clean the data > Check the data > Format the data
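The four steps could look like this in base R. This is only a sketch: the inline text stands in for a real file, and the column names are hypothetical.

```r
# Read in data (inline text stands in for a real file path).
raw <- read.csv(text = "id,gender,score
1,female,4
2,male,NA
3,female,5", stringsAsFactors = FALSE)

# Clean the data: drop rows with a missing score.
clean <- raw[!is.na(raw$score), ]

# Check the data: inspect the structure and a numeric summary.
str(clean)
summary(clean$score)

# Format the data: store the nominal variable as a factor.
clean$gender <- factor(clean$gender)
```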
What are type I and type II errors?
Type I = false positive (rejecting a null hypothesis that is true)
Type II = false negative (failing to reject a null hypothesis that is false)
What is the main difference between T-test and ANOVA?
T-test only tests two groups; ANOVA tests two or more groups.
How can we tell two variables are positively or negatively correlated, just by looking at the r value?
If r is negative, the variables are negatively correlated; if r is positive, they are positively correlated.
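A quick check of the sign with cor(), which returns Pearson's r by default. All three vectors are made-up data.

```r
hours  <- c(1, 2, 3, 4, 5)
grade  <- c(52, 60, 61, 70, 78)  # tends to rise as hours rise
errors <- c(9, 7, 6, 4, 1)       # tends to fall as hours rise

r_pos <- cor(hours, grade)   # positive r: positive correlation
r_neg <- cor(hours, errors)  # negative r: negative correlation
c(r_pos, r_neg)
```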
What is the difference between cluster sampling and stratified sampling?
Cluster: identify naturally occurring clusters, randomly select one or more clusters, and sample from within them.
Stratified: divide the population into strata and sample from every stratum, so that the sample reflects those strata.
What is the difference between select() and filter()?
select() keeps columns (variables); filter() keeps rows (observations) that meet a condition.
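A minimal sketch of the contrast, assuming the dplyr package is installed: select() keeps columns by name, and filter() keeps the rows that meet a condition. The survey data frame is hypothetical.

```r
library(dplyr)

# Hypothetical data frame, used only for illustration.
survey <- data.frame(
  id     = 1:4,
  gender = c("female", "male", "female", "male"),
  score  = c(4, 2, 5, 3)
)

# select() works on columns: keep id and score, all rows stay.
cols_only <- select(survey, id, score)

# filter() works on rows: keep rows with score >= 4, all columns stay.
rows_only <- filter(survey, score >= 4)
```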
In a normal distribution, what is the relationship between mean, mode, and median?
mean = mode = median
What is ANOVA actually testing (not null hypothesis)?
How between-group variation compares with within-group variation (the F statistic is their ratio)
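As a sketch, the F statistic reported by aov() can be reproduced by dividing the between-group mean square by the within-group (residual) mean square. The scores and group labels are made up.

```r
# Three made-up groups of three observations each.
scores <- c(4, 5, 6, 7, 8, 9, 1, 2, 3)
group  <- factor(rep(c("a", "b", "c"), each = 3))

fit <- aov(scores ~ group)
tab <- summary(fit)[[1]]  # ANOVA table: row 1 = group, row 2 = residuals

ms_between <- tab[["Mean Sq"]][1]  # variation between group means
ms_within  <- tab[["Mean Sq"]][2]  # variation inside the groups
f_manual   <- ms_between / ms_within
f_manual
```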
What is chi-squared actually testing (not null hypothesis)?
The deviation between observed and expected frequencies.
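A sketch of that comparison by hand, checked against chisq.test(). The 2 x 2 table of counts is made up, and the continuity correction is switched off so that the two results match.

```r
# Made-up counts: rows are gender, columns are the answer given.
observed <- matrix(
  c(30, 10, 20, 40), nrow = 2,
  dimnames = list(gender = c("female", "male"), answer = c("yes", "no"))
)

# Expected count per cell = row total * column total / grand total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Chi-squared sums the squared deviations, scaled by the expected counts.
chi_sq <- sum((observed - expected)^2 / expected)

test <- chisq.test(observed, correct = FALSE)
c(manual = chi_sq, builtin = unname(test$statistic))
```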
Why do we want to aim for the "highest" level of measurement?
More statistical options; can always collapse down to lower levels.
If I want to detect "cup" and "creep", use the str_detect function. What should I put in the curly brackets {}?
str_detect(df, "c.{_____}p")
1,3
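To see why 1,3 works, here is a small sketch assuming the stringr package is installed. The words vector is made up; "cup" has one character between c and p, and "creep" has three.

```r
library(stringr)

# Made-up test words.
words <- c("cup", "creep", "cp", "catnip")

# "c", then one to three of any character, then "p".
matches <- str_detect(words, "c.{1,3}p")
matches
```

"cp" has no character between c and p and "catnip" has four, so neither matches.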
What is the difference between standard deviation and standard error?
Standard deviation = how much individual values deviate from the mean of the sample
Standard error = how much the mean of a sample is expected to deviate from the true mean of the population
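Both can be computed in base R. As a sketch with made-up scores, the standard error of the mean is the standard deviation divided by the square root of the sample size.

```r
# Made-up sample of 1-to-5 ratings.
scores <- c(4, 5, 3, 4, 2, 5, 4, 3)

sd_scores <- sd(scores)                        # spread of individual values
se_mean   <- sd_scores / sqrt(length(scores))  # precision of the sample mean
c(sd = sd_scores, se = se_mean)
```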
When we compare three groups, why should we use ANOVA instead of several T-tests comparing two groups every time (1 and 2, 2 and 3, 1 and 3)?
Running several T-tests inflates the probability of a type I error.
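A rough calculation of the inflation, assuming each test uses a 0.05 cutoff and that the tests are independent (a simplifying assumption, since pairwise tests on the same groups are not fully independent):

```r
alpha <- 0.05
k <- choose(3, 2)  # number of pairwise comparisons among 3 groups

# Probability of at least one false positive across the k tests.
familywise <- 1 - (1 - alpha)^k
familywise
```

With three comparisons the chance of at least one false positive is about 14%, not 5%.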
What is phi (Φ) for?
The strength (effect size) of the association.
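For a 2 x 2 table, phi can be sketched as the square root of chi-squared divided by the sample size. The counts below are made up, and the continuity correction is turned off.

```r
# Made-up 2 x 2 table of counts.
observed <- matrix(c(30, 10, 20, 40), nrow = 2)

test <- chisq.test(observed, correct = FALSE)
n    <- sum(observed)

# phi = sqrt(chi-squared / n); values nearer 1 mean a stronger association.
phi <- sqrt(unname(test$statistic) / n)
phi
```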