This measure ranges from 0 to 1 and tells you what proportion of variance in Y is explained by your model.
What is R-squared?
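A minimal sketch of the calculation, assuming you already have observed values and fitted values from some model. The data below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def r_squared(y, y_hat):
    """Share of the variation in y that the fitted values y_hat account for."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)              # total variation around the mean
    return 1 - ss_res / ss_tot

# Hypothetical observed values and fitted values (illustration only)
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(y, y_hat)
```

Here the residual sum of squares is 0.10 against a total of 5.0, so the fitted values explain about 98% of the variation in Y.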
This occurs when two or more independent variables in your regression are highly correlated with each other.
What is multicollinearity?
When X1's effect on Y depends on the value of X2, you include this type of term, written in R as X1:X2 or within X1*X2.
What is an interaction effect (or interaction term)?
This unit-free measure represents the percentage change in Y for a 1% change in X and is obtained from a log-log model.
What is elasticity?
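A minimal sketch of why the log-log slope is the elasticity, assuming hypothetical data generated exactly as Y = 2 * X^1.5. The helper name ols_slope_intercept is made up for this illustration.

```python
import math

def ols_slope_intercept(x, y):
    """Slope and intercept of a one-predictor least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data with a known elasticity of 1.5 (illustration only)
x = [1.0, 2.0, 4.0, 8.0]
y = [2.0 * xi ** 1.5 for xi in x]

# Regress ln(Y) on ln(X): the slope is the elasticity
elasticity, log_intercept = ols_slope_intercept(
    [math.log(v) for v in x],
    [math.log(v) for v in y],
)
```

The fitted slope recovers 1.5, meaning a 1% increase in X goes with roughly a 1.5% increase in Y regardless of the units X and Y are measured in.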
This technique reduces dimensionality by creating new variables that are linear combinations of the original variables.
What is Principal Components Analysis?
In a simple linear regression, this is the predicted value of Y when X equals zero, represented by the point where the line crosses the Y-axis.
What is the intercept?
Calculate this for each predictor; values above 5 (or 10) indicate problematic multicollinearity.
What is the Variance Inflation Factor (VIF)?
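A minimal sketch of the calculation, assuming only two predictors. In general the factor for predictor j is 1 / (1 - R_j²), where R_j² comes from regressing X_j on all the other predictors; with exactly two predictors, R_j² is just their squared correlation. The data are hypothetical.

```python
def pearson_r(a, b):
    """Sample correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def vif_two_predictors(x1, x2):
    """Two-predictor special case: R_j^2 equals the squared correlation."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)

# Hypothetical predictors that move almost in lockstep (illustration only)
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
vif = vif_two_predictors(x1, x2)
```

These two predictors are nearly collinear, so the result is far above either the 5 or the 10 rule of thumb.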
This technique adds a penalty term to the loss function to prevent overfitting by shrinking coefficient values toward zero.
What is regularization?
These transformations can be applied to an advertising variable to capture diminishing returns while ensuring the effect never turns negative.
What are log(AdSpending) and sqrt(AdSpending)?
This graph displays eigenvalues in descending order; the "elbow" where the curve levels off suggests how many components to retain.
What is a scree plot?
We assess significance of a slope by checking if this value is below 0.05, or if the confidence interval excludes this number.
What is the p-value and what is zero?
This violation occurs when the variance of residuals is not constant across all levels of X, creating a funnel shape in residual plots.
What is heteroscedasticity?
This regularization method uses an L1 penalty and can shrink coefficients exactly to zero, performing automatic variable selection.
What is Lasso?
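A minimal sketch of how an L1 penalty produces exact zeros, under a strong simplifying assumption: the predictors are orthonormal and the objective is (1/2) * RSS + lambda * sum(|beta|). In that special case each Lasso coefficient is the OLS coefficient pulled toward zero by lambda and set to zero if it would cross it (soft thresholding). The coefficients and lambda below are hypothetical.

```python
def soft_threshold(beta_ols, lam):
    """Lasso coefficient in the orthonormal-design special case."""
    if beta_ols > lam:
        return beta_ols - lam   # shrink positive coefficients down by lam
    if beta_ols < -lam:
        return beta_ols + lam   # shrink negative coefficients up by lam
    return 0.0                  # small coefficients are dropped entirely

# Hypothetical OLS coefficients and penalty strength (illustration only)
ols = [3.0, -0.4, 0.2, -2.5]
lasso = [soft_threshold(b, 1.0) for b in ols]
```

The two small coefficients become exactly zero, which is the variable selection behavior; a ridge (L2) penalty would shrink them but leave them nonzero.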
In a quadratic model Y = β₀ + β₁X + β₂X², you get an inverted U-shape when β₁ has this sign and β₂ has this sign.
What is positive for β₁ and negative for β₂? (β₁ > 0, β₂ < 0)
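A short derivation of why those signs give an inverted U, using only the model as written in the clue. X* is notation introduced here for the turning point.

```latex
\frac{dY}{dX} = \beta_1 + 2\beta_2 X = 0
\quad\Longrightarrow\quad
X^{*} = -\frac{\beta_1}{2\beta_2},
\qquad
\frac{d^{2}Y}{dX^{2}} = 2\beta_2 < 0 .
```

A negative β₂ makes the curve concave, so X* is a maximum, and a positive β₁ places that peak at a positive value of X.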
In nonparametric regression, this parameter controls the size of the local window.
What is bandwidth?
When the number of parameters equals the number of observations, you can get this R-squared value, but you should never use it for forecasting because of this problem.
What is R-squared = 1, and what is overfitting?
Multicollinearity inflates these and heteroscedasticity makes them unreliable, although neither problem biases the coefficient estimates.
What are standard errors?
A positive interaction coefficient means these two variables do this to each other's effects, while a negative coefficient means they do this.
What are amplify (positive) and dampen (negative)?
In the model Ln(Y) = β₀ + β₁Ln(X) + β₂D + β₃Ln(X)×D, this is the elasticity when D=1, and this is the elasticity when D=0.
What are β₁ + β₃ (when D=1) and β₁ (when D=0)?
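The answer follows from differentiating the model with respect to Ln(X), treating D as fixed:

```latex
\frac{\partial \ln Y}{\partial \ln X} = \beta_1 + \beta_3 D
\quad\Longrightarrow\quad
\begin{cases}
\beta_1 + \beta_3 & \text{if } D = 1,\\
\beta_1 & \text{if } D = 0.
\end{cases}
```

So β₃ is the difference in elasticity between the two groups, while β₂ only shifts the intercept.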
How does GAM differ from linear models and fully nonparametric models?
A GAM uses flexible smooth functions of each predictor (unlike a linear model) while maintaining an additive structure (unlike a fully nonparametric model).
Why does lm() fail when p > N?
With more parameters than observations, X'X is singular and cannot be inverted, so this underdetermined system has infinitely many solutions.
What calculation do we use to detect influential observations?
What is Cook's Distance?
What resampling technique is used to assess the significance of indirect effects, given that their sampling distribution is not normal?
What is bootstrapping?
What are the three sequential variable selection methods?
What are backward selection, forward selection, and stepwise selection?
Explain the bias-variance tradeoff in bandwidth selection.
A small bandwidth gives low bias but high variance; a large bandwidth gives high bias but low variance.
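A minimal sketch of the tradeoff, assuming a Nadaraya-Watson kernel smoother with a Gaussian kernel as the nonparametric method (the course may use a different smoother). The data and the two bandwidth values are hypothetical extremes picked to make the contrast obvious.

```python
import math

def kernel_smooth(x_train, y_train, x0, bandwidth):
    """Nadaraya-Watson estimate at x0: a Gaussian-weighted average of y."""
    weights = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x_train]
    return sum(w * yi for w, yi in zip(weights, y_train)) / sum(weights)

# Hypothetical noisy observations (illustration only)
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, 2.0, 1.0, 3.0, 2.0]

# Tiny window: each estimate uses essentially one point (low bias, high variance)
small = [kernel_smooth(x, y, xi, 0.1) for xi in x]

# Huge window: every estimate is close to the overall mean (high bias, low variance)
large = [kernel_smooth(x, y, xi, 100.0) for xi in x]
```

With the tiny bandwidth the fit reproduces every wiggle in the data, noise included; with the huge bandwidth it flattens to roughly the sample mean of 1.6 and misses the upward trend.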