GLM
Generalized Linear Model
This test checks for stationarity
Augmented Dickey-Fuller Hypothesis Test
By making decisions from a subset of features from a bootstrapped dataframe, one will discover that life really is like a box of chocolates.
Random Forest Gump
(Random Forest + Forrest Gump)
This hyperparameter sets the regularization strength for the SVM model.
C (the inverse of the regularization strength, alpha)
This is the term for a SQL query within another SQL query
Subquery
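A minimal sketch of a subquery, run through Python's built-in sqlite3 module. The pokemon table and its rows are hypothetical, invented only for illustration.

```python
import sqlite3

# Hypothetical in-memory table; names and values are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon (name TEXT, type TEXT, attack INTEGER)")
conn.executemany(
    "INSERT INTO pokemon VALUES (?, ?, ?)",
    [("Bulbasaur", "Grass", 49), ("Charmander", "Fire", 52), ("Machop", "Fighting", 80)],
)

# The inner SELECT (the subquery) computes the average attack;
# the outer SELECT keeps only the rows above that average.
rows = conn.execute(
    "SELECT name FROM pokemon "
    "WHERE attack > (SELECT AVG(attack) FROM pokemon) "
    "ORDER BY name"
).fetchall()
print(rows)
conn.close()
```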
MMC
Maximum Margin Classifier
This famous Swiss mathematician developed a well-known distribution to describe large numbers of trials (each individual trial having some probability of being a failure or success).
Jacob (or Jacques) Bernoulli
Whoopi Goldberg is placed in a witness protection program as a nun. Like a hidden layer, she's forced to change her perspective into a new range of outputs.
Sister Activation Function
(Sister Act + Activation Function)
This is the total number of hyperparameters used in a Simple Linear Regression or OLS model.
0
This handy SQL function allows us to fill in NULLs
COALESCE
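A short sketch of COALESCE filling in NULLs, using Python's built-in sqlite3 module. The trainers table, its columns, and the 'unknown' placeholder are assumptions made for the example.

```python
import sqlite3

# Hypothetical table where one nickname is missing (NULL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trainers (name TEXT, nickname TEXT)")
conn.executemany(
    "INSERT INTO trainers VALUES (?, ?)",
    [("Ash", None), ("Misty", "Mist")],
)

# COALESCE returns its first non-NULL argument, so a NULL nickname
# is replaced by the fallback string.
rows = conn.execute(
    "SELECT name, COALESCE(nickname, 'unknown') FROM trainers ORDER BY name"
).fetchall()
print(rows)
conn.close()
```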
BIC
Bayesian Information Criterion
This Japanese statistician worked on information theory and developed a way to use parsimony to describe the relative quality of a model given a set of data.
Hirotugu Akaike (Akaike Information Criterion, or AIC)
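For reference, the two criteria can be sketched with their standard textbook definitions, AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L). The function names and the sample numbers are hypothetical.

```python
import math

def aic(log_likelihood, k):
    """Akaike Information Criterion for k estimated parameters."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian Information Criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_likelihood

# Made-up fit: log-likelihood of -100 with 3 parameters on 100 observations.
print(aic(-100.0, 3))
print(bic(-100.0, 3, 100))
```

Lower values are better for both. BIC penalizes each extra parameter by ln(n) rather than 2, so it favors smaller models once n exceeds about 7.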
As part of this secret agency, Will Smith must embrace the unknown. There is only one term for this type of model that is devoid of any interpretability.
Men in Black Box Model
(Men in Black + Black Box Model)
A Logistic Regression is made from 3 components: A Linear Component, a Random error component, and this component, which bends the linear input into a range between 0 and 1
Logit Link Function
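A small sketch of the link and its inverse. Strictly speaking, the logit maps a probability to the linear scale, and its inverse (the sigmoid) is what squeezes the linear input into the range between 0 and 1. The function names are chosen for illustration.

```python
import math

def logit(p):
    """Link function: maps a probability in (0, 1) to the real line (log-odds)."""
    return math.log(p / (1 - p))

def inverse_logit(z):
    """Inverse link (sigmoid): maps any real linear input into (0, 1)."""
    return 1 / (1 + math.exp(-z))

# A linear input of 0 corresponds to a probability of one half.
print(inverse_logit(0))
# Applying the link and then its inverse recovers the original probability.
print(inverse_logit(logit(0.8)))
```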
This is the result from the following SQL query:
SELECT type, AVG(attack), AVG(defense)
FROM pokemon
GROUP BY type
ORDER BY AVG(attack) DESC
HAVING AVG(defense) > 50
Syntax Error (HAVING must come before ORDER BY)
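The clause order can be checked with Python's built-in sqlite3 module. This is a sketch against SQLite only; the sample rows are invented, and other database engines may word the error differently.

```python
import sqlite3

# Hypothetical data for the pokemon table used in the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pokemon (type TEXT, attack INTEGER, defense INTEGER)")
conn.executemany(
    "INSERT INTO pokemon VALUES (?, ?, ?)",
    [("Grass", 49, 49), ("Grass", 62, 63), ("Fire", 52, 43), ("Rock", 80, 100)],
)

bad_query = """
SELECT type, AVG(attack), AVG(defense)
FROM pokemon
GROUP BY type
ORDER BY AVG(attack) DESC
HAVING AVG(defense) > 50
"""

# SQLite rejects the query because HAVING appears after ORDER BY.
try:
    conn.execute(bad_query)
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)
print(error)

# Same query with HAVING moved directly after GROUP BY.
good_query = """
SELECT type, AVG(attack), AVG(defense)
FROM pokemon
GROUP BY type
HAVING AVG(defense) > 50
ORDER BY AVG(attack) DESC
"""
rows = conn.execute(good_query).fetchall()
print(rows)
conn.close()
```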
DBSCAN
Density-Based Spatial Clustering of Applications with Noise
In 1908, this Englishman working at the Guinness brewery in Dublin published the t-test and t distribution under the name "Student"
William Sealy Gosset
This book by Ernest Cline was turned into a movie by Steven Spielberg and features a dystopia where people find salvation in a game called "The OASIS". The game may be "dummifying", but the players would prefer this other term.
Ready Player One-Hot Encoding
(Ready Player One + One-Hot Encoding)
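One-hot encoding itself can be sketched in a few lines of plain Python. The one_hot helper and the sample categories are hypothetical; in practice a library routine would usually do this.

```python
def one_hot(values):
    """Return the sorted category list and one 0/1 row per input value."""
    categories = sorted(set(values))
    encoded = [[1 if value == category else 0 for category in categories]
               for value in values]
    return categories, encoded

# Each row has exactly one 1, in the column of its category.
categories, encoded = one_hot(["Fire", "Grass", "Fire", "Water"])
print(categories)
print(encoded)
```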
In the SARIMA model, these are the 7 main hyperparameters to set.
(p,d,q) for the order & (P,D,Q,S) for the seasonal order
This is the meaning of ETL, a common paradigm in data storage
Extract, Transform, Load
MLE
Maximum Likelihood Estimation
Although Simon Newcomb discovered the phenomenon in 1881, this physicist made the concept more popular in 1938 by publishing a paper titled "The Law of Anomalous Numbers"
Frank Benford (Benford's Law)
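The law predicts that a leading digit d occurs with probability log10(1 + 1/d). A minimal sketch of that formula, with a hypothetical function name:

```python
import math

def benford_probability(d):
    """Expected share of numbers whose leading digit is d (1 through 9)."""
    return math.log10(1 + 1 / d)

# Probabilities for every possible leading digit.
probs = {d: benford_probability(d) for d in range(1, 10)}
for d, p in probs.items():
    print(d, round(p, 3))
```

The probabilities fall steadily from 1 to 9 and sum to 1, so a leading 1 is the most common and a leading 9 the rarest.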
This classic Hitchcock horror/thriller features a young man who runs a deadly motel with his "mother". The audience may think the variables involved are independent of each other, but this couldn't be further from the truth.
Psycho-llinearity
(Psycho + collinearity)
In the KNeighborsClassifier from sklearn, this is the name of the hyperparameter for stipulating how distance is calculated
metric
This is the kind of join that returns records that have matching values in both tables
Inner Join
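A brief sketch of an inner join, again using Python's built-in sqlite3 module. Both tables and their rows are hypothetical.

```python
import sqlite3

# Hypothetical tables: one pokemon has no trainer, one trainer has no pokemon.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trainers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE pokemon (name TEXT, trainer_id INTEGER)")
conn.executemany(
    "INSERT INTO trainers VALUES (?, ?)",
    [(1, "Ash"), (2, "Misty"), (3, "Brock")],
)
conn.executemany(
    "INSERT INTO pokemon VALUES (?, ?)",
    [("Pikachu", 1), ("Staryu", 2), ("Mewtwo", None)],
)

# Only rows with a match in both tables survive the inner join.
rows = conn.execute(
    "SELECT p.name, t.name "
    "FROM pokemon AS p "
    "INNER JOIN trainers AS t ON p.trainer_id = t.id "
    "ORDER BY p.name"
).fetchall()
print(rows)
conn.close()
```

Mewtwo (no trainer) and Brock (no pokemon) are both absent from the result because neither has a matching row on the other side.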