The model's training score is noticeably better than its testing score.
What is overfitting
The generic metric we optimize when fitting a machine learning model.
What is a loss function
The success of this algorithm can be affected by outliers.
What is K-Means Clustering
This distribution models single, binary events.
What is the Bernoulli Distribution
One defining characteristic of time series data, which sets it apart from data with independent observations.
What is Autocorrelation
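A minimal sketch of lag-k autocorrelation, using only built-in Python. The function name and the two toy series are assumptions made for illustration; they are not part of the original cards.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of a sequence at the given lag."""
    n = len(series)
    mean = sum(series) / n
    # Denominator: total squared deviation from the mean.
    denom = sum((x - mean) ** 2 for x in series)
    # Numerator: co-movement of each value with the value `lag` steps earlier.
    numer = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


trend = [1, 2, 3, 4, 5, 6, 7, 8]             # each value resembles the previous one
alternating = [1, -1, 1, -1, 1, -1, 1, -1]   # each value opposes the previous one

print(autocorrelation(trend))        # 0.625
print(autocorrelation(alternating))  # -0.875
```

A value near zero at every lag would suggest the observations behave like independent draws.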
The training data in a Random Forest is compiled via this method.
What is bootstrapping
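A minimal sketch of a bootstrap sample, assuming the training rows are a plain Python list. The helper name, the seed, and the ten toy rows are hypothetical. Each tree in a Random Forest would typically receive its own sample like this one.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, so some rows repeat and some are left out."""
    return rng.choices(rows, k=len(rows))


rng = random.Random(0)       # fixed seed only so the sketch is repeatable
rows = list(range(10))       # stand-in for ten training rows
sample = bootstrap_sample(rows, rng)

# Rows that were never drawn are the "out-of-bag" rows for this sample.
out_of_bag = sorted(set(rows) - set(sample))
print(sample, out_of_bag)
```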
TP / (TP + FN)
What is Sensitivity
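The ratio can be sketched as a small function. The name and the example counts are assumptions chosen for illustration.

```python
def sensitivity(tp, fn):
    """True positive rate: the share of actual positives the model caught."""
    if tp + fn == 0:
        raise ValueError("no actual positives, so sensitivity is undefined")
    return tp / (tp + fn)


# 8 true positives and 2 false negatives: 8 of the 10 actual positives were found.
print(sensitivity(tp=8, fn=2))  # 0.8
```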
This metric can be used to evaluate intra-cluster cohesion and inter-cluster separation.
What is the silhouette score
This distribution models the number of events that occur in a fixed time period.
What is the Poisson distribution
This algorithm models several interrelated time series at once, letting each series help predict the others.
What is VAR (vector autoregression)
This type of model uses a collection of weak learners fit iteratively.
What is boosting
This algorithm finds the optimal Beta coefficients in a Regularized Linear Regression Model.
What is Gradient Descent
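A minimal sketch of gradient descent on a ridge-regularized regression with a single coefficient and no intercept. The loss form, learning rate, step count, and toy data are all assumptions made for illustration. The closed-form solution is computed only to check the result.

```python
def ridge_gradient_descent(xs, ys, alpha=0.1, lr=0.01, steps=2000):
    """Minimize mean((y - beta*x)^2) + alpha*beta^2 by repeated gradient steps."""
    n = len(xs)
    beta = 0.0
    for _ in range(steps):
        # Gradient of the squared-error term plus gradient of the L2 penalty.
        grad = (-2 / n) * sum(x * (y - beta * x) for x, y in zip(xs, ys))
        grad += 2 * alpha * beta
        beta -= lr * grad  # step downhill
    return beta


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
alpha = 0.1

beta = ridge_gradient_descent(xs, ys, alpha=alpha)
# Closed-form minimizer of the same loss, for comparison.
closed_form = sum(x * y for x, y in zip(xs, ys)) / (
    sum(x * x for x in xs) + len(xs) * alpha
)
print(beta, closed_form)
```

The penalty pulls the coefficient slightly below the unregularized slope of 2.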
This clustering evaluation metric will decrease as the number of clusters increases.
What is inertia
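A minimal sketch of how inertia responds as clusters are added, on one-dimensional toy data. The centroids are picked by hand here rather than fitted by K-Means, which is an assumption made to keep the example short.

```python
def inertia(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min((p - c) ** 2 for c in centroids) for p in points)


points = [1.0, 2.0, 9.0, 10.0]

one_cluster = inertia(points, [5.5])          # a single centroid at the overall mean
two_clusters = inertia(points, [1.5, 9.5])    # one centroid per visible group
four_clusters = inertia(points, points)       # every point is its own centroid

print(one_cluster, two_clusters, four_clusters)  # 65.0 1.0 0.0
```

Because the value keeps shrinking toward zero, it is usually read with an elbow plot rather than minimized outright.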
This statistical procedure tests whether the mean difference between two paired series of data is equal to zero.
What is a Paired T-test
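A minimal sketch of the paired t statistic, using only the standard library. The function name and the before/after measurements are hypothetical. Turning the statistic into a p-value would need a t distribution with n - 1 degrees of freedom, which is left out here.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """t = mean(differences) / (sample std of differences / sqrt(n))."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error


before = [10, 12, 9, 11]
after = [12, 13, 11, 14]

t_stat = paired_t_statistic(before, after)
print(t_stat)  # roughly 4.9, with 3 degrees of freedom
```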
The statistical test to find the "d" hyper-parameter in ARIMA models.
What is the Augmented Dickey-Fuller test
The least common type of missing data.
What is "Missing Completely at Random"
The GLM link function for linear regression.
What is the Identity Function
This method takes what one machine learning model has learned and reuses it in a different model.
What is transfer learning
This type of inference uses new information to update our understanding of event probability.
What is Bayesian inference
The null hypothesis for the Augmented Dickey-Fuller test.
What is "the series is not stationary"
This tree-based ensemble builds the least correlated trees.
What is Extra Trees
The loss function for Logistic Regression.
What is Log Loss / Binary Cross-Entropy
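A minimal sketch of binary cross-entropy in plain Python. The function name, the clipping constant `eps`, and the example labels and probabilities are assumptions made for illustration.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Average negative log-likelihood of the true labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


labels = [1, 0, 1]
confident_and_right = log_loss(labels, [0.9, 0.1, 0.8])
confident_and_wrong = log_loss(labels, [0.1, 0.9, 0.2])
coin_flip = log_loss([1, 0], [0.5, 0.5])  # equals ln(2)

print(confident_and_right, confident_and_wrong, coin_flip)
```

Confident wrong predictions are penalized far more heavily than confident correct ones are rewarded.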
The relationship between any two principal components.
What is orthogonal
This proposes a range of plausible values for an unknown parameter.
What is a confidence interval
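A minimal sketch of a confidence interval for a mean, assuming a normal approximation (a t-based interval would be wider for a sample this small). The function name and the sample values are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation interval: mean +/- z * standard error."""
    n = len(sample)
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # z is the standard normal quantile that leaves (1 - confidence) / 2 in each tail.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error


sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]
low95, high95 = mean_confidence_interval(sample, 0.95)
low99, high99 = mean_confidence_interval(sample, 0.99)
print((low95, high95), (low99, high99))
```

Asking for more confidence widens the range of plausible values.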
This algorithm can account for sudden shocks in a time series.
What is a Moving Average (MA) model