Quiz 4 Jeopardy

Tree-Based Methods

Ensemble Methods

Gradient Descent

SQL

100

These models, like CARTs, don't make any assumptions about the distribution of the data. In addition, features are considered independently.

What are non-parametric models?

100

This is the main philosophy behind ensemble methods.

What is using multiple models to make a better averaged model? (also accepted: Wisdom of the Crowds)

100

This term describes the value a model finds when it converges to a point that is not the global optimum.

What is a local minimum?

100

Used to pick columns from a table.

What is SELECT?

200

These are the criteria that classification/regression decision trees use, respectively.

What are the Gini index and MSE?

200

This is building several parallel models on random bootstrapped samples, then averaging the predictions of the models.

What is bagging (bootstrap aggregating)?
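
A minimal sketch of bagging in plain Python, assuming a deliberately trivial "model" that always predicts the mean of its training targets. The helper names (bootstrap_sample, fit_mean_model, bagged_predict) and the toy data are hypothetical and only serve the illustration; a real ensemble would fit decision trees or another learner on each sample.

```python
import random
from statistics import mean


def bootstrap_sample(data, rng):
    """Draw len(data) observations from data, with replacement."""
    return [rng.choice(data) for _ in data]


def fit_mean_model(sample):
    """Trivial stand-in for a real learner: always predict the mean target."""
    y_bar = mean(y for _, y in sample)
    return lambda x: y_bar


def bagged_predict(models, x):
    """Aggregate by averaging the predictions of all models."""
    return mean(m(x) for m in models)


rng = random.Random(0)                         # fixed seed for repeatability
data = [(x, 2.0 * x) for x in range(10)]       # toy (feature, target) pairs
samples = [bootstrap_sample(data, rng) for _ in range(25)]
models = [fit_mean_model(s) for s in samples]  # models are independent of each other
prediction = bagged_predict(models, 3)
print(prediction)
```

Each model sees a different resampled version of the data, and none of them depends on another, which is why the fits could run in parallel.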

200

A reason why a gradient descent algorithm does not converge.

What is 𝛼 (step size) being too big?

200

Allows you to select rows by a condition.

What is WHERE?

300

This is the Gini index value when the classes are perfectly balanced in a two-class node, while this is the Gini index when only one class is represented in a node, respectively.

What are 0.5 and 0, respectively?
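
The two values can be checked with a short calculation. This sketch assumes the usual definition of Gini impurity, 1 minus the sum of squared class proportions; the function name gini and the sample labels are made up for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


balanced = gini(["a", "a", "b", "b"])   # two classes, split 50/50
pure = gini(["a", "a", "a", "a"])       # only one class in the node
print(balanced, pure)
```

Note that 0.5 is the maximum only for two classes; with k perfectly balanced classes the same formula gives 1 - 1/k.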

300

True or False: When fitting a random forest classifier, the sample of observations is randomly drawn without replacement.

What is "False"? Random observations are taken with replacement.

300

This is the reason my model has been running for the last 6 hours with no end in sight.

What is 𝛼 (step size) being too small?

300

This is the order in which the clauses of a SQL query are written (the SQL commands! Not the mnemonic).

What are:

SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT?

400

Deep trees suffer from this kind of error, while shallow trees usually suffer from that kind of error.

What are variance and bias, respectively?

400

This is why ensemble methods often perform better than non-ensemble methods.

What is greatly reducing overfitting by combining multiple models?

400

A common tuning parameter that guarantees termination but not necessarily convergence of my model.

What is max_iter(ations)?
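
The three gradient descent clues so far (step size too big, step size too small, and max_iter) can be seen in one small experiment. This is a sketch under stated assumptions: a one-dimensional function f(x) = x**2, a hypothetical gradient_descent helper, and a stopping rule based on the size of the last step. None of this comes from a particular library.

```python
def gradient_descent(grad, x0, alpha, max_iter=1000, tol=1e-8):
    """Minimize a 1-D function given its gradient.

    Returns (x, iterations_used, converged).
    """
    x = x0
    for i in range(1, max_iter + 1):
        step = alpha * grad(x)
        x = x - step
        if abs(step) < tol:        # steps have become negligible: converged
            return x, i, True
    return x, max_iter, False      # stopped by max_iter, not by convergence


def grad(x):
    """Gradient of f(x) = x**2, whose minimum is at x = 0."""
    return 2.0 * x


good = gradient_descent(grad, x0=1.0, alpha=0.1)
too_big = gradient_descent(grad, x0=1.0, alpha=1.1)      # overshoots, |x| grows
too_small = gradient_descent(grad, x0=1.0, alpha=1e-6)   # barely moves
print(good, too_big, too_small)
```

In the last two runs the loop still ends because of max_iter, but the returned flag shows that the result is not a converged solution.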

400

One filters rows before GROUP BY and the other filters groups after GROUP BY, respectively.

What are WHERE and HAVING, respectively?
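
A small query using every clause from the SQL column so far, run through Python's built-in sqlite3 module. The orders table and its rows are invented for the example. WHERE drops individual rows before grouping, while HAVING drops whole groups after aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 5.0), ("ann", 20.0), ("ann", 30.0),
     ("bob", 15.0), ("bob", 2.0),
     ("cat", 40.0), ("cat", 50.0)],
)

query = """
SELECT customer, SUM(amount) AS total
FROM orders
WHERE amount >= 10          -- row filter: small orders are ignored
GROUP BY customer
HAVING SUM(amount) >= 20    -- group filter: small totals are ignored
ORDER BY total DESC
LIMIT 2
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Here bob keeps one row after WHERE (15.0) but his group is removed by HAVING, because the remaining total is below 20.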

500

These 3 hyperparameters can be used to reduce decision trees' tendency to overfit.

What are max_depth, min_samples_split and max_leaf_nodes?

500

These are the three main differences between bagging and boosting.

What are:

- Models built in parallel vs. sequential
- Aggregate results vs. learning from each model in turn
- Random subsets of data make each model unique vs learning from the mistakes of the last model to make the next one better

500

The technique where 𝛼 (step size) is not held fixed but is adjusted at each iteration.

What is Adaptive Gradient Descent?

500

The 1st returns records with matching values in BOTH tables; the 2nd returns all records from BOTH tables, whether or not they have a match in the other table.

What are INNER JOIN and FULL OUTER JOIN, respectively?
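
The difference between the two joins can be seen on two tiny tables, again through sqlite3. The students and grades tables are hypothetical. Because older SQLite versions do not support FULL OUTER JOIN directly, this sketch emulates it with a LEFT JOIN plus the unmatched rows of the right table; that emulation is a choice made for portability here, not the only way to write it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER, name TEXT);
CREATE TABLE grades (student_id INTEGER, grade TEXT);
INSERT INTO students VALUES (1, 'ann'), (2, 'bob');
INSERT INTO grades VALUES (2, 'A'), (3, 'B');
""")

# INNER JOIN: only rows with a match in both tables.
inner = conn.execute("""
SELECT s.name, g.grade
FROM students AS s
INNER JOIN grades AS g ON s.id = g.student_id
""").fetchall()

# Emulated FULL OUTER JOIN: all left rows, plus right rows with no match.
full_outer = conn.execute("""
SELECT s.name, g.grade
FROM students AS s
LEFT JOIN grades AS g ON s.id = g.student_id
UNION ALL
SELECT s.name, g.grade
FROM grades AS g
LEFT JOIN students AS s ON s.id = g.student_id
WHERE s.id IS NULL
""").fetchall()

print(inner)
print(full_outer)
```

Unmatched sides show up as NULL (None in Python): ann has no grade, and the grade 'B' has no student.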