Tree Based Methods
Ensemble Methods
GLMs & SVMs
Gradient Descent
SQL
100

These models, like CARTs, don't make any assumptions about the distribution of the data. In addition, features are considered independently.

What are non-parametric models? 

100

This is the main philosophy behind ensemble methods.

What is using multiple models to make a better averaged model? (Also accepted: Wisdom of the Crowds)

100

Math formula that "forces" the data into a higher dimension and helps create/find the best linear boundary between classes. 

What is a kernel? (Kernel trick is accepted as well)
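As a rough illustration of the clue above, the sketch below compares a degree-2 polynomial kernel computed in the original 2-D space with a plain dot product in an explicit 3-D feature space. The kernel choice and the names `poly_kernel` and `feature_map` are assumptions made for this example, not something the quiz specifies.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, y):
    """Degree-2 homogeneous polynomial kernel: (x . y) ** 2."""
    return dot(x, y) ** 2

def feature_map(x):
    """Explicit 3-D feature map for a 2-D point that matches poly_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

x, y = (1.0, 2.0), (3.0, -1.0)
lhs = poly_kernel(x, y)                      # computed in the original 2-D space
rhs = dot(feature_map(x), feature_map(y))    # computed in the 3-D feature space
```

Both values agree, which is the point of the trick: the kernel gives the higher-dimensional dot product without ever building the higher-dimensional vectors.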

100

This term describes the point a model converges to when that point is not the global optimum.

What is a local minimum?
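A minimal sketch of how this can happen, assuming a made-up non-convex function f(x) = x^4 - 3x^2 + x and a fixed step size: the same gradient descent loop ends in different valleys depending on where it starts.

```python
def f(x):
    """Hypothetical non-convex function with two valleys."""
    return x ** 4 - 3 * x ** 2 + x

def f_prime(x):
    """Derivative of f."""
    return 4 * x ** 3 - 6 * x + 1

def descend(x0, alpha=0.01, steps=2000):
    """Plain gradient descent with a fixed step size alpha."""
    x = x0
    for _ in range(steps):
        x -= alpha * f_prime(x)
    return x

x_left = descend(-2.0)    # settles in the deeper valley (the global minimum)
x_right = descend(2.0)    # settles in the shallower valley (a local minimum)
```

Starting on the right, the model converges, but `f(x_right)` is higher than `f(x_left)`, so it has not found the best available result.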

100

Used to pick columns from a table

What is SELECT?

200

These are the criteria that classification/regression decision trees use, respectively.

What are the Gini index and MSE?

200

This is building several parallel models on random bootstrapped samples, then averaging (or voting on) each model's predictions.

What is Bagging?
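A minimal sketch of the idea, assuming a one-feature least-squares line as a stand-in base model (bagging is more commonly applied to high-variance models such as decision trees). The helper names `fit_line` and `bagged_predict` and the synthetic data are made up for this example.

```python
import random

def fit_line(points):
    """Ordinary least squares for y = a + b * x on a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope

def bagged_predict(points, x_new, n_models=25, seed=0):
    """Fit one model per bootstrap sample, then average their predictions."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        # Bootstrap sample: same size as the data, drawn WITH replacement.
        sample = [rng.choice(points) for _ in points]
        intercept, slope = fit_line(sample)
        predictions.append(intercept + slope * x_new)
    return sum(predictions) / len(predictions)

# Synthetic data: y = 2x + 1 plus Gaussian noise (an assumption for the demo).
noise = random.Random(42)
data = [(x, 2.0 * x + 1.0 + noise.gauss(0, 0.5)) for x in range(20)]
prediction = bagged_predict(data, x_new=10)
```

Each model sees a slightly different dataset, and the final prediction is the plain average over all of them.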

200

This regularization parameter controls the "leniency" for misclassification. Increasing it leads to less lenient boundaries, while decreasing it leads to more lenient ones.

What is "C"? 

200

A common reason why gradient descent does not converge

What is an 𝛼 (learning rate) that is too big?
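To see the effect, here is a small sketch on f(x) = x^2, whose gradient is 2x. The function and the two step sizes are assumptions picked for the demo: with 𝛼 = 0.1 each step multiplies x by 0.8, while with 𝛼 = 1.1 each step multiplies x by -1.2, so the iterate bounces across the minimum and grows.

```python
def gradient_descent(alpha, x0=1.0, steps=20):
    """Minimize f(x) = x ** 2 (gradient 2 * x) with a fixed step size alpha."""
    x = x0
    for _ in range(steps):
        x = x - alpha * 2 * x
    return x

x_small_step = gradient_descent(alpha=0.1)   # shrinks toward the minimum at 0
x_big_step = gradient_descent(alpha=1.1)     # overshoots further on every step
```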

200

Allows you to filter rows by a condition

What is WHERE?

300

This is the Gini index value when two classes are perfectly balanced in a node, while this is the Gini index when only one class is represented in a node, respectively.

What are 0.5 and 0, respectively?
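These values follow from the usual Gini impurity formula, 1 minus the sum of squared class proportions. A short sketch (the function name `gini_impurity` is made up here); note that 0.5 is the maximum only in the two-class case:

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

balanced = gini_impurity(["a", "b", "a", "b"])   # 1 - (0.25 + 0.25) = 0.5
pure = gini_impurity(["a", "a", "a", "a"])       # 1 - 1 = 0.0
three_way = gini_impurity(["a", "b", "c"])       # 1 - 3 * (1/9) = 2/3
```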

300

True or false: when fitting a random forest classifier, the sample of observations is randomly drawn without replacement.

What is false? It is drawn with replacement.

300

The three components of all GLMs

What are the linear predictor (systematic component), the link function, and the random component?
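One way to picture the three pieces is a Poisson GLM with a log link, sketched below with made-up coefficient values: the linear predictor combines the features, the link ties it to the mean, and the random component is the assumed distribution of the response.

```python
import math

def linear_predictor(x, intercept, slope):
    """Systematic component: eta = intercept + slope * x."""
    return intercept + slope * x

def log_link(mu):
    """Link function: maps the mean mu to the linear predictor scale."""
    return math.log(mu)

def inverse_log_link(eta):
    """Inverse link: maps the linear predictor back to a positive mean."""
    return math.exp(eta)

def poisson_log_likelihood(y, mu):
    """Random component: log P(Y = y) for a Poisson response with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

# Hypothetical coefficients, chosen only for illustration.
eta = linear_predictor(x=2.0, intercept=0.5, slope=0.3)
mu = inverse_log_link(eta)
ll = poisson_log_likelihood(y=3, mu=mu)
```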

300

This is the reason my model has been running for the last 6 hours with no end in sight.

What is an 𝛼 that is too small?

300

The order in which you write common clauses in a SQL query

What are:

SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT?
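A small runnable sketch of that clause order, using Python's built-in `sqlite3` module and a made-up `orders` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ana", 50.0, "paid"),
        ("ana", 70.0, "paid"),
        ("ben", 20.0, "paid"),
        ("ben", 500.0, "cancelled"),
        ("cat", 90.0, "paid"),
        ("cat", 40.0, "paid"),
    ],
)

# Clauses appear in the written order: SELECT, FROM, WHERE, GROUP BY,
# HAVING, ORDER BY, LIMIT.
query = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE status = 'paid'        -- filters rows before grouping
    GROUP BY customer
    HAVING SUM(amount) > 100     -- filters groups after aggregation
    ORDER BY total DESC
    LIMIT 2
"""
rows = conn.execute(query).fetchall()
conn.close()
```

Here `rows` holds the two customers whose paid orders total more than 100, largest total first. The cancelled order is dropped by WHERE before any sums are computed.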

400

Deep trees suffer from error due to this, while shallow trees usually suffer from error due to that

What are variance and bias, respectively?

400

This is why ensemble methods often perform better than non-ensemble methods

What is reducing overfitting (variance) by combining multiple models?

400

A type of regression we'll use to predict/model a discrete value between 0 and ∞

What is Poisson Regression?

400

A common tuning parameter that guarantees termination but not necessarily convergence of my model.

What is max_iterations?
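A minimal sketch of such a cap, assuming gradient descent on f(x) = x^2 with a gradient-size stopping rule. The tolerance, starting point, and return convention are choices made for this example.

```python
def gradient_descent(alpha, max_iterations, tol=1e-6, x0=10.0):
    """Minimize f(x) = x ** 2; stop on a small gradient or at the iteration cap."""
    x = x0
    for i in range(max_iterations):
        grad = 2 * x
        if abs(grad) < tol:
            return x, i, True            # converged before the cap
        x -= alpha * grad
    return x, max_iterations, False      # terminated, but did not converge

x_ok, n_ok, converged_ok = gradient_descent(alpha=0.1, max_iterations=1000)
x_slow, n_slow, converged_slow = gradient_descent(alpha=1e-6, max_iterations=1000)
```

With a reasonable step size the loop stops early. With a tiny step size it still stops, because of the cap, but `x_slow` has barely moved from the starting point.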

400

The first filters rows before GROUP BY, while the second filters groups after GROUP BY

What are WHERE and HAVING, respectively? 

500

These hyperparameters can be used to reduce a decision tree's tendency to overfit

What are max_depth, min_samples_split and max_leaf_nodes?

500

These are the three differences between Bagging and boosting

What are: 

Parallel vs sequential
Aggregate results vs learning from each model in turn
Random subsets of data make each model unique vs learning from the mistakes of the last model to make the next one better?

500

A type of regression we'll use to predict/model a continuous value between 0 and ∞

What is Gamma Regression?

500

The general name for what gradient descent is trying to minimize.

What is the loss function? (or what is the cost function)

500

The first returns only records with matching values in both tables; the second returns all records from both tables, matched where possible

What are INNER JOIN and FULL OUTER JOIN, respectively?
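A small sketch of the difference using Python's built-in `sqlite3` module and two made-up tables. It uses LEFT OUTER JOIN as the outer-join example, since older SQLite versions do not support FULL OUTER JOIN; a full outer join would also keep unmatched rows from the right table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE grades (student_id INTEGER, grade TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)", [(1, "ana"), (2, "ben"), (3, "cat")]
)
conn.executemany("INSERT INTO grades VALUES (?, ?)", [(1, "A"), (2, "B")])

# INNER JOIN: only students that have a matching grade row.
inner_rows = conn.execute("""
    SELECT s.name, g.grade
    FROM students AS s
    INNER JOIN grades AS g ON g.student_id = s.id
    ORDER BY s.id
""").fetchall()

# LEFT OUTER JOIN: every student, with NULL (None) where no grade matches.
left_rows = conn.execute("""
    SELECT s.name, g.grade
    FROM students AS s
    LEFT OUTER JOIN grades AS g ON g.student_id = s.id
    ORDER BY s.id
""").fetchall()
conn.close()
```

The inner join drops "cat" because there is no grade for that student, while the outer join keeps the row and fills the grade with `None`.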
