Naked Statistics
Logistic Regression
Regression
SEM
Factor Analysis
100
What are the two types of errors and what do they represent, both from a practical and a statistical perspective?
Type 1: Finding something that’s not there (false positive); considered worse in psychology. Treated as "more important" in stats: alpha is conventionally set at 0.05.
Type 2: Not finding something that is there (false negative); considered worse in medicine. Stats allows for more of these mistakes (20%) because power is conventionally set at 80%.
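The link between the 0.05, 80%, and 20% figures can be sketched numerically. The example below is only an illustration under assumptions that are not in the original: a two-sided one-sample z-test with a known standard deviation of 1, and a hypothetical effect size and sample size chosen so that power lands near 80%.

```python
from statistics import NormalDist


def z_test_error_rates(effect_size, n, alpha=0.05):
    """Return (alpha, beta, power) for a two-sided one-sample z-test.

    Assumes a known standard deviation of 1, so effect_size is in SD units.
    alpha is the Type 1 error rate; beta is the Type 2 error rate.
    """
    std_normal = NormalDist()
    # Critical value: reject the null when |z| exceeds this
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How far the true effect shifts the test statistic
    shift = effect_size * n ** 0.5
    # Probability of rejecting the null when the effect is real
    power = (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)
    return alpha, 1 - power, power


# Hypothetical study: effect of 0.56 SD, 25 participants
alpha, beta, power = z_test_error_rates(effect_size=0.56, n=25)
print(f"Type 1 rate (alpha): {alpha:.2f}")
print(f"Type 2 rate (beta):  {beta:.2f}")
print(f"Power (1 - beta):    {power:.2f}")
```

Power and the Type 2 error rate always sum to 1, which is why setting power at 80% means accepting a 20% Type 2 error rate.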
100
What is the key difference between linear regression and logistic regression?
In logistic regression the DV is dichotomous/categorical; in linear regression the DV is continuous.
100
What is the difference between Standard Regression and Hierarchical Regression?
In Standard Regression, you enter all of your predictors at the same time, because you have no reason to believe one affects the outcome more than another. In Hierarchical Regression, you enter different variables in different steps. In this way, you are testing the effect of one set of variables "above and beyond" another set of variables.
100
Main difference between SEM and factor analysis is that ...
SEM handles both measured and latent variables
100
What is the difference between internal and external validity, and why is there a trade off between them?
Internal Validity describes how confidently you can conclude that one variable causes another. External Validity describes how well your findings apply to the real world. As internal validity increases (by controlling for confounds, often in a lab setting), external validity decreases (because the real world does not fit perfectly into a lab setting!)
200
Mention three cons of probability and what they mean ...
- assumes events are independent, when sometimes they aren’t
- clusters happen
- prosecutor’s fallacy
- regression to the mean
- statistical discrimination
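The prosecutor’s fallacy in particular is easier to see with numbers. The sketch below uses entirely hypothetical figures (a 1-in-a-million random match probability and a pool of 10 million people), and it assumes exactly one guilty person, a certain match for that person, and equal prior suspicion for everyone in the pool.

```python
def prob_guilty_given_match(random_match_prob, population_size):
    """P(guilty | match) via Bayes' rule.

    Assumes one guilty person in the population, who matches for certain,
    and that every person is equally likely to be guilty beforehand.
    """
    # Innocent people expected to match purely by chance
    expected_innocent_matches = (population_size - 1) * random_match_prob
    # One true match competes with the chance matches
    return 1 / (1 + expected_innocent_matches)


# Hypothetical numbers, chosen only for illustration
p_match_if_innocent = 1e-6
p_guilty = prob_guilty_given_match(p_match_if_innocent, population_size=10_000_000)
print(f"P(match | innocent) = {p_match_if_innocent}")
print(f"P(guilty | match)   = {p_guilty:.3f}")
```

The fallacy is reading the tiny P(match | innocent) as if it were P(innocent | match). Under these assumptions a match alone leaves the probability of guilt at roughly 9%.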
200
What are the possible variable selection methods in logistic regression?
Standard, hierarchical, statistical
200
In a Hierarchical Regression, which variables are entered first?
First: the variable(s) you want to control for. Then: the variable(s) you are interested in.
200
What are 2 types of variables and what do they represent in SEM? (not measured vs. latent; other classification ^_^)
Exogenous = independent variables
Endogenous = dependent variables in at least one equation (they may be independent variables in other equations in the system)
200
What is the difference between Exploratory Factor Analysis and Confirmatory Factor Analysis?
Exploratory Factor Analysis is used early on in the research process. It helps to generate hypotheses, and to summarize and describe data. Confirmatory Factor Analysis is used with SEM to test whether the predicted latent constructs are supported by the data.
300
Mention three statistical biases and define them ...
Selection Bias - when you select only people from a certain category
Publication Bias - when only studies with "positive" results get published
Recall Bias - when people recall things differently depending on their experience or present situation
Survivorship Bias - when only the people who survived remain in a longitudinal study
Healthy User Bias - when people who take vitamins are healthier, but possibly because they pay more attention to health in general rather than because of the vitamins
300
A linear least squares regression equation can be solved with a formula whereas a logistic regression equation is solved _____. The equation is the probability of _____ divided by the probability of _____.
Iteratively; being in one group; being in the other group.
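To make "iteratively" and the odds concrete, here is a minimal sketch with one predictor. The data are hypothetical, and plain gradient ascent on the log-likelihood is used as a simple stand-in for whatever iterative routine a statistics package actually runs (typically a Newton-type method).

```python
import math


def odds(p):
    """Odds: probability of being in one group divided by the probability of being in the other."""
    return p / (1 - p)


def sigmoid(z):
    """Convert log-odds z into a probability."""
    return 1 / (1 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.01, n_iter=20000):
    """Estimate intercept b0 and slope b1 by repeated small updates."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        # Predicted probability of group 1 under the current estimates
        preds = [sigmoid(b0 + b1 * x) for x in xs]
        # Gradient of the log-likelihood with respect to b0 and b1
        grad0 = sum(y - p for y, p in zip(ys, preds))
        grad1 = sum((y - p) * x for x, y, p in zip(xs, ys, preds))
        # Step uphill; there is no closed-form formula to jump to the answer
        b0 += learning_rate * grad0
        b1 += learning_rate * grad1
    return b0, b1


# Hypothetical data: hours studied and pass (1) / fail (0)
hours = [0, 1, 2, 3, 4, 5, 6, 7]
passed = [0, 0, 1, 0, 1, 1, 0, 1]

b0, b1 = fit_logistic(hours, passed)
p_at_7 = sigmoid(b0 + b1 * 7)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"P(pass | 7 hours) = {p_at_7:.3f}, odds = {odds(p_at_7):.3f}")
```

The log of the odds is the linear part of the model, b0 + b1 * x, which is why the equation is written in terms of odds rather than raw probabilities.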
300
What are the three research questions we looked at for multiple regression?
Degree of Relationship; Importance of IVs; Contingencies among IVs
300
Rather than asking for definitions, we have some fill in the blanks for SEM ... :) Essentially, SEM combines ______ with ______. You end up with a ________ model and a _______ model. Also, SEM is usually considered to be a _______ tool rather than an _______ procedure.
factor analysis; regression; measurement; structural; confirmatory; exploratory
300
What is one main difference between Factor Analysis and Principal Components Analysis?
In Factor Analysis, factors are said to "cause" variables. In Principal Components Analysis, variables are grouped together into components to reduce the number of variables (the components are what we called factors in class).
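The "decrease the number of variables" idea can be sketched for the smallest possible case: two standardized variables. This is an illustration on hypothetical data, not a full PCA routine. It relies on the fact that a 2 x 2 correlation matrix with correlation r has eigenvalues 1 + |r| and 1 - |r|.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def two_variable_pca(r):
    """Eigenvalues of [[1, r], [r, 1]] and the variance share of the first component."""
    eigenvalues = (1 + abs(r), 1 - abs(r))
    # Total variance of two standardized variables is 2
    share_first = eigenvalues[0] / 2
    return eigenvalues, share_first


# Hypothetical scores on two questionnaire items
item_a = [2, 4, 6, 8, 10]
item_b = [1, 3, 2, 5, 4]

r = pearson_r(item_a, item_b)
eigenvalues, share_first = two_variable_pca(r)
print(f"r = {r:.2f}")
print(f"eigenvalues = {eigenvalues[0]:.2f}, {eigenvalues[1]:.2f}")
print(f"first component explains {share_first:.0%} of the variance")
```

When the items correlate strongly, one component carries most of the variance, so the two variables can reasonably be summarized by one.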
400
Can you remember some limitations (and what they mean) of regression mentioned in Naked Stats? ... :)
· Using regression to analyze a nonlinear relationship: Sometimes there is no line of best fit for data
· Correlation does not equal causation: Regression analysis only shows a relationship between variables
· Reverse causality: The variable you think is the IV may in fact be the DV
· Omitted variable bias: Explanatory variables that have a significant influence on the DV may not be accounted for
· Highly correlated explanatory variables (multicollinearity): It can be difficult to determine individual effects of explanatory variables that are highly correlated with one another
· Extrapolating beyond the data: Results are valid only for a population similar to the sample. Problem for regressions without normal distribution, representative sampling
· Data mining: Including too many variables can compromise results. Only include variables with a theoretical justification
400
Name the 2 ways in which logistic regression (LR) is more flexible
Doesn’t need all the assumptions of linear regression; can use a mix of continuous and categorical predictors.
400
What does Model 2 in a hierarchical regression contain? And what does it tell you? (What stats are important for this?)
Model 2 contains all the variables. R-squared change gives the variance explained by the variables that are in this model but not in the first one, and F-change tells us whether that added variance is significant. Model 2 therefore gives us the information we want about the variables we wanted to look at more closely.
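The F-change statistic can be computed by hand from the two R-squared values. The sketch below uses the standard formula for comparing nested regression models; the sample size, predictor counts, and R-squared values are hypothetical.

```python
def f_change(r2_model1, r2_model2, n, k_model1, k_model2):
    """F-change for adding predictors in Model 2 of a hierarchical regression.

    k_model1 and k_model2 are the numbers of predictors in each model;
    n is the sample size. Returns (r2_change, F, df_numerator, df_denominator).
    """
    r2_change = r2_model2 - r2_model1
    df_num = k_model2 - k_model1   # predictors added in Model 2
    df_den = n - k_model2 - 1      # residual df of Model 2
    f_value = (r2_change / df_num) / ((1 - r2_model2) / df_den)
    return r2_change, f_value, df_num, df_den


# Hypothetical study: 2 control variables in Model 1, 2 more predictors in Model 2
r2_change, f_value, df_num, df_den = f_change(
    r2_model1=0.20, r2_model2=0.30, n=100, k_model1=2, k_model2=4
)
print(f"R-squared change = {r2_change:.2f}")
print(f"F-change({df_num}, {df_den}) = {f_value:.2f}")
```

Here the added predictors explain another 10% of the variance "above and beyond" the controls, and the F-change is what gets compared against the F distribution to judge significance.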
400
The estimates in the SEM are conceptually related to ... ? But the main difference is that ... ?
The estimates are conceptually similar to regression coefficients. The main difference from traditional regression coefficients is that the estimates are not compromised by random measurement error, thanks to the measurement model.
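What "compromised by random measurement error" means can be sketched with the classical attenuation formula, under which an observed correlation equals the true correlation times the square root of the product of the two reliabilities. This is only an illustration of the problem SEM addresses, not how SEM itself computes its estimates; the correlation and reliabilities are hypothetical.

```python
import math


def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Observed correlation when both measures contain random error."""
    return true_r * math.sqrt(reliability_x * reliability_y)


def disattenuated_correlation(observed_r, reliability_x, reliability_y):
    """Estimate of the error-free correlation from the observed one."""
    return observed_r / math.sqrt(reliability_x * reliability_y)


# Hypothetical values: true relationship 0.50, reliabilities 0.70 and 0.80
observed = attenuated_correlation(0.50, 0.70, 0.80)
recovered = disattenuated_correlation(observed, 0.70, 0.80)
print(f"observed r  = {observed:.3f}")
print(f"recovered r = {recovered:.3f}")
```

Ordinary regression works with the observed, shrunken relationship, whereas a model that separates the latent variable from its measurement error aims at the recovered one.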
400
Name two ways to increase reliability
- standardize conditions of measurement
- run pilot tests to ensure clear question wording
- train observers and coders
- code data correctly
500
Name three different types of experiments from the last chapter of Naked Statistics and give a short description of each :)
Randomized, controlled experiment; Natural Experiment; Non-equivalent control; Difference in Differences; Discontinuity Analysis
500
Come up with a hierarchical logistic regression question.
...
500
Name one example of an experiment using Standard Regression and one example of an experiment using Hierarchical Regression. Make sure to include the relevant statistics and tell us all what they mean!
...
500
What two indices are used for the goodness of fit in SEM and how are they scaled regarding the evaluation of the goodness of fit?
- The Comparative Fit Index (CFI) and the Root Mean Square Error of Approximation (RMSEA) are two of the highly recommended indices.
- They are scaled differently: high values of the CFI indicate good fit (.95 has been offered as a threshold), and low values of the RMSEA indicate good fit (.08 and .05 have been proposed as cut-offs).
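How the two indices end up on opposite scales can be sketched from their usual textbook definitions. These formulas are not given in the original and are stated here as an assumption; some software divides by N rather than N - 1 in the RMSEA. The chi-square values, degrees of freedom, and sample size are hypothetical.

```python
import math


def rmsea(chi2, df, n):
    """Root Mean Square Error of Approximation: lower is better."""
    return math.sqrt(max(chi2 - df, 0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative Fit Index: higher is better, bounded between 0 and 1."""
    misfit_model = max(chi2_model - df_model, 0)
    misfit_baseline = max(chi2_baseline - df_baseline, misfit_model, 0)
    if misfit_baseline == 0:
        return 1.0  # neither model shows misfit beyond its df
    return 1 - misfit_model / misfit_baseline


# Hypothetical output from an SEM run with 300 participants
model_rmsea = rmsea(chi2=85, df=40, n=300)
model_cfi = cfi(chi2_model=85, df_model=40, chi2_baseline=1000, df_baseline=55)
print(f"RMSEA = {model_rmsea:.3f} (compare with the .08 and .05 cut-offs)")
print(f"CFI   = {model_cfi:.3f} (compare with the .95 threshold)")
```

The RMSEA measures remaining misfit per degree of freedom, so it shrinks toward 0 as fit improves, while the CFI measures improvement over a baseline model, so it grows toward 1.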
500
What is one example of a situation in which you would use Factor Analysis or Principal Components Analysis?
You probably explained something about personality tests! ...or you were creative, which is fine too!