Number of events/successes divided by the total number of trials
Probability
Test for comparing means between two independent groups
Independent samples t test
(ANOVA also accepted)
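As an illustration of comparing two independent group means, here is a minimal sketch of a pooled-variance t statistic. It assumes equal variances in both groups; the function name `pooled_t` and the data are made up for the example.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Return (t statistic, degrees of freedom) for two independent samples.

    Assumes equal population variances (classic Student's t test).
    """
    na, nb = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    # Standard error of the difference in means
    se_diff = sqrt(sp2 * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se_diff, na + nb - 2

# Hypothetical measurements for two groups
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 4.8, 4.5, 5.0, 4.4]

t_stat, df = pooled_t(group_a, group_b)
print(f"t = {t_stat:.2f}, df = {df}")
```

A large absolute t relative to the t distribution with `df` degrees of freedom is evidence against equal means.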
Linear regression model for binary outcomes
Linear Probability Model
Exponentiated regression coefficients in logistic regression
Odds Ratios (ORs)
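A short sketch of the exponentiation step, assuming a hypothetical coefficient of 0.05 per year of age (not a value from any fitted model):

```python
from math import exp

def odds_ratio(beta, units=1):
    """Odds ratio for a `units`-unit increase in a predictor with log-odds coefficient beta."""
    return exp(beta * units)

beta_age = 0.05  # hypothetical logistic regression coefficient (log-odds per year)

print(f"OR per 1 year:   {odds_ratio(beta_age):.3f}")
print(f"OR per 10 years: {odds_ratio(beta_age, 10):.3f}")
```

A coefficient of 0 gives an odds ratio of 1 (no association); positive coefficients give ORs above 1 and negative coefficients give ORs below 1.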
The probability of getting a test statistic as extreme as, or more extreme than, the one observed, assuming the null hypothesis is true
P-value
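A minimal sketch of this definition, assuming the test statistic follows a standard normal distribution under the null hypothesis (a z test); the helper name `two_sided_p` is made up for the example.

```python
from math import erfc, sqrt

def two_sided_p(z):
    """Two-sided p-value: P(|Z| >= |z|) for a standard normal Z."""
    return erfc(abs(z) / sqrt(2))

for z in (0.5, 1.96, 3.0):
    print(f"z = {z:4.2f}  p = {two_sided_p(z):.4f}")
```

The further the statistic is from 0, the smaller the p-value; z = 1.96 sits at the conventional 0.05 threshold.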
Hypothesis that the means in both groups are the same in an independent samples t test
Null hypothesis
Proportion of the variance in Y (dependent variable) explained by all X (independent variables) in the model
R-squared
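To make the "variance explained" idea concrete, here is a sketch that computes R-squared by hand. It is simplified to a single predictor fitted by least squares, and the data are invented for illustration.

```python
from statistics import mean

def r_squared(x, y):
    """R-squared for a simple linear regression of y on x (one predictor)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx          # least-squares slope
    b0 = my - b1 * mx       # least-squares intercept
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained
    ss_tot = sum((yi - my) ** 2 for yi in y)                          # total
    return 1 - ss_res / ss_tot

x = [1, 2, 3, 4, 5]                  # hypothetical predictor values
y = [2.1, 3.9, 6.2, 7.8, 10.1]       # hypothetical outcome values
print(f"R-squared = {r_squared(x, y):.3f}")
```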
Extensions of the linear model to account for non-continuous outcomes
Generalized Linear Models
Range of values that, if calculated repeatedly from new samples, would contain the true population parameter 95% of the time
95% confidence intervals
("95%" necessary for points)
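The "calculated repeatedly" part can be checked by simulation. The sketch below assumes a normal population with a known standard deviation (so the normal critical value 1.96 applies); the population values, sample size, and number of repetitions are arbitrary choices for the example.

```python
import random
from math import sqrt
from statistics import mean

random.seed(1)                   # fixed seed so the run is repeatable
MU, SIGMA, N = 50.0, 10.0, 30    # hypothetical population mean, SD, and sample size
REPS = 2000                      # number of repeated samples
Z = 1.96                         # normal critical value for 95% confidence

covered = 0
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    xbar = mean(sample)
    half_width = Z * SIGMA / sqrt(N)
    # Does this interval contain the true population mean?
    if xbar - half_width <= MU <= xbar + half_width:
        covered += 1

coverage = covered / REPS
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The share should land close to 0.95; any single interval either contains the true mean or does not.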
Range of correlation coefficients
-1 to 1
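A small sketch of Pearson's correlation coefficient computed by hand, showing the two ends of the range. The function name `pearson_r` and the data are made up for the example.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))   # perfect positive linear relationship
print(pearson_r(x, [10, 8, 6, 4, 2]))   # perfect negative linear relationship
print(pearson_r(x, [3, 1, 4, 1, 5]))    # weaker relationship, somewhere in between
```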
Are expected to be normally distributed with constant variance and mean = 0 for valid inference in linear regression
Residuals
Link function in logistic regression "transforming" the left side of the equation (the probability of the outcome) so it can be modeled as a linear function of the predictors
Logistic, logit, log-odds
Central limit theorem (CLT)
Statistical tests conducted after an overall significant test in ANOVA to identify the specific group(s) with different means from others
Post-hoc tests
Term that allows for the effect of one predictor on the outcome to be modified by another predictor
Interaction term
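A sketch of how an interaction term changes a predictor's effect, assuming a linear model with two predictors and made-up coefficients: the slope of x1 is no longer a single number but depends on x2.

```python
# Hypothetical coefficients for: y = B0 + B1*x1 + B2*x2 + B3*(x1*x2)
B0, B1, B2, B3 = 1.0, 0.5, 2.0, -0.3

def predict(x1, x2):
    """Predicted outcome, including the x1*x2 interaction term."""
    return B0 + B1 * x1 + B2 * x2 + B3 * x1 * x2

def slope_of_x1(x2):
    """Change in the prediction per one-unit increase in x1, at a given x2."""
    return B1 + B3 * x2

for x2 in (0, 1, 2):
    print(f"x2 = {x2}: effect of x1 = {slope_of_x1(x2):+.2f}")
```

With B3 = 0 the effect of x1 would be the same at every value of x2, which is the no-interaction case.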
Test that evaluates whether the full model (with all independent variables) fits the data better than a null model with no predictors (intercept-only).
In other words, this test evaluates whether all regression coefficients equal 0.
Omnibus Test (Likelihood-ratio test)
Standard deviation of the sampling distribution of the sample mean
Standard error (SE)
(or, in this case, Standard Error of the Mean)
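In practice the standard error of the mean is estimated from a single sample as the sample standard deviation divided by the square root of the sample size. A minimal sketch with invented data:

```python
from math import sqrt
from statistics import stdev

def standard_error(sample):
    """Estimated standard error of the mean: s / sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
print(f"SE of the mean = {standard_error(data):.3f}")
```

Because n is in the denominator, larger samples give smaller standard errors for the same spread.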
Tests that are used when assumptions of t tests or ANOVA are not met (e.g., non-normality)
Non-parametric tests (e.g., Mann-Whitney or Kruskal-Wallis)
Situation when the outcome is not well represented by a linear model (e.g., if the outcome increases quadratically with the predictor)
Non-linearity
(Model misspecification)
Equation of the effect of age on high cholesterol using logistic regression (population)
log-odds(High cholesterol) = Beta0 + Beta1*Age
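To show how the log-odds scale maps back to a probability, here is a sketch using hypothetical values for Beta0 and Beta1 (they are not estimates from any data set).

```python
from math import exp, log

BETA0, BETA1 = -5.0, 0.08   # hypothetical intercept and age coefficient

def log_odds(age):
    """Linear predictor: log-odds of high cholesterol at a given age."""
    return BETA0 + BETA1 * age

def probability(age):
    """Inverse logit: convert the log-odds into a probability between 0 and 1."""
    return 1 / (1 + exp(-log_odds(age)))

def logit(p):
    """Logit link: convert a probability into log-odds."""
    return log(p / (1 - p))

for age in (40, 50, 60, 70):
    print(f"age {age}: log-odds = {log_odds(age):+.2f}, probability = {probability(age):.3f}")
```

The log-odds change by the same amount for every additional year of age, while the probability changes along an S-shaped curve.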