Concepts in Biostats
T-tests, ANOVA, and correlation coefficients
Linear Regression
Logistic Regression
100

Number of events/successes divided by total number of trials

Probability

100

Test for comparing means between two independent groups

Independent-samples t test

(ANOVA also accepted)
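
Illustration (not part of the quiz answer): a minimal sketch of what this test computes, using only the Python standard library. It uses the equal-variance (pooled) form of the statistic. The function name `pooled_t_statistic` and the data values are made up for the example.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Return (t, df) for an equal-variance independent-samples t test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    # Standard error of the difference between the two means
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se_diff, df


# Hypothetical measurements for two independent groups
group_a = [5.1, 4.9, 5.6, 5.8, 5.2]
group_b = [4.2, 4.8, 4.5, 4.1, 4.9]
t_stat, df = pooled_t_statistic(group_a, group_b)
print(f"t = {t_stat:.2f}, df = {df}")
```

The p-value would then come from comparing t to a t distribution with df degrees of freedom, which needs a statistics package or a table.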

100

Linear regression model for binary outcomes

Linear Probability Model

100

Exponentiated regression coefficients in logistic regression

Odds Ratios (ORs)
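
Illustration (not part of the quiz answer): a short sketch of the exponentiation step. The coefficient value `beta_age` is hypothetical and not taken from any fitted model.

```python
import math

# Hypothetical logistic regression coefficient:
# change in log-odds per one-unit increase in the predictor
beta_age = 0.05

odds_ratio = math.exp(beta_age)
print(f"OR per 1-unit increase: {odds_ratio:.3f}")

# A 10-unit increase multiplies the odds by exp(10 * beta),
# which equals OR ** 10, not 10 * OR
odds_ratio_10 = math.exp(10 * beta_age)
print(f"OR per 10-unit increase: {odds_ratio_10:.3f}")
```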

200

The probability of getting a test statistic as extreme as, or more extreme than, the one observed, assuming the null hypothesis is true

P-value

200

Hypothesis that the means in both groups are the same in an independent-samples t test

Null hypothesis

200

Proportion of the variance in Y (dependent variable) explained by all X (independent variables) in the model

R-squared
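
Illustration (not part of the quiz answer): a minimal sketch of R-squared as 1 minus the ratio of residual to total sum of squares, for the one-predictor case only. The function name `r_squared` and the data are made up for the example.

```python
import statistics


def r_squared(x, y):
    """R-squared for a simple (one-predictor) least-squares line."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left unexplained by the fitted line
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation of Y around its mean
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data with a nearly linear relationship
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(f"R-squared = {r_squared(x, y):.3f}")
```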

200

Extensions of the linear model to account for non-continuous outcomes

Generalized Linear Models

300

Range of values that, if calculated repeatedly over successive samples, would contain the true population parameter 95% of the time

95% confidence intervals

("95%" necessary for points)
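
Illustration (not part of the quiz answer): a small simulation of the "calculated repeatedly" idea. It assumes a normal population with a known standard deviation so that a z-based interval can be used. The population values, sample size, and number of repetitions are arbitrary choices.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

TRUE_MEAN, TRUE_SD = 50.0, 10.0  # hypothetical population values
N, REPS = 30, 2000
z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
half_width = z * TRUE_SD / math.sqrt(N)  # population SD assumed known

covered = 0
for _ in range(REPS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    sample_mean = statistics.mean(sample)
    # Does this sample's interval contain the true mean?
    if sample_mean - half_width <= TRUE_MEAN <= sample_mean + half_width:
        covered += 1

coverage = covered / REPS
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95. Any single interval either contains the true mean or does not.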

300

Range of correlation coefficients

-1 to 1

300

Are expected to be normally distributed with constant variance and mean = 0 for valid inference in linear regression

Residuals

300

Link function in logistic regression "transforming" the right side of the equation

Logistic, logit, log-odds

400

Mathematical statement that the distribution of the sample means (over successive samples) approximates a normal distribution as the sample size grows, regardless of the distribution of the population

Central limit theorem (CLT)
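
Illustration (not part of the quiz answer): a simulation sketch comparing a strongly right-skewed population with the distribution of its sample means. The exponential population, the sample size of 50, and the helper `skewness` are choices made for the example.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible


def skewness(values):
    """Skewness: about 0 for a symmetric, bell-shaped distribution."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


N, REPS = 50, 5000

# Individual draws from an exponential population (rate 1, mean 1)
raw_draws = [random.expovariate(1.0) for _ in range(REPS)]

# Means of REPS successive samples, each of size N, from the same population
sample_means = [
    statistics.mean([random.expovariate(1.0) for _ in range(N)])
    for _ in range(REPS)
]

print(f"Skewness of raw draws:    {skewness(raw_draws):.2f}")
print(f"Skewness of sample means: {skewness(sample_means):.2f}")
print(f"Mean of sample means:     {statistics.mean(sample_means):.3f}")
```

The sample means should be far less skewed than the raw draws and centered near the population mean of 1.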

400

Statistical tests conducted after an overall significant test in ANOVA to identify the specific group(s) with different means from others

Post-hoc tests

400

Term that allows for the effect of one predictor on the outcome to be modified by another predictor

Interaction term

400

Test that evaluates whether the full model (with all independent variables) fits the data better than a null model with no predictors (intercept-only).

In other words, this test evaluates whether all regression coefficients (other than the intercept) are equal to 0.

Omnibus Test (Likelihood-ratio test)


500

Standard deviation of the sampling distribution of the sample mean

Standard error (SE) 

(or, in this case, Standard Error of the Mean)
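
Illustration (not part of the quiz answer): a sketch of the usual estimate from a single sample, the sample standard deviation divided by the square root of the sample size. The readings are invented for the example.

```python
import math
import statistics

# Hypothetical sample of systolic blood pressure readings (mmHg)
sample = [118, 125, 131, 122, 140, 135, 128, 119]

sd = statistics.stdev(sample)      # sample standard deviation
sem = sd / math.sqrt(len(sample))  # estimated standard error of the mean
print(f"SD = {sd:.2f}, SEM = {sem:.2f}")
```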

500

Tests that are used when assumptions of t tests or ANOVA are not met (e.g., non-normality)

Non-parametric tests (e.g., Mann-Whitney or Kruskal-Wallis)

500

Situation when the outcome is not well represented by a linear model (e.g., if the outcome increases quadratically with the predictor)

Non-linearity 

(Model misspecification)

500

Equation of the effect of age on high cholesterol using logistic regression (population)

log-odds(High cholesterol) = Beta0 + Beta1*Age
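
Illustration (not part of the quiz answer): a sketch of how a log-odds equation of this form maps back to a probability through the inverse logit. The coefficient values `BETA0` and `BETA1` and the function name `predicted_probability` are hypothetical and not estimated from any data.

```python
import math


def predicted_probability(age, beta0, beta1):
    """Inverse logit: convert the log-odds for a given age to a probability."""
    log_odds = beta0 + beta1 * age
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical population coefficients
BETA0, BETA1 = -4.0, 0.06

for age in (30, 50, 70):
    p = predicted_probability(age, BETA0, BETA1)
    print(f"Age {age}: P(high cholesterol) = {p:.3f}")
```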