Metric Ton
Bears II
Wiggly Biters
Project Runway
Potent Potables
200

The classification metric "True Negative Rate" is also known as this

What is Specificity?

200

If a model scores very well on it's training set, but much more poorly on the testing set, the model is likely suffering from this

What is overfitting?

200

This character must be at the end of the first line when defining a function

What is a colon (:)?

200

sklearn's train_test_split returns four objects, in exactly this order

What are X_train, X_test, y_train, y_test?

200

At the 5% significance level, a null hypothesis is rejected in favor of the alternate hypothesis when this value is below 0.05

What is a p-value?

400

In an sklearn LogisticRegression classifier model, this is the default scoring metric of the object

What is accuracy?

400

This Pandas method will return a boolean DataFrame or Series indicating whether or not a value is np.nan

What is isnull()?

What is isna()?

400

In an sklearn train_test_split, the proportional size of the testing set can be controlled by assigning this argument

What is test_size?

400

This hyperparameter controls the strength of regularization in Ridge Regressions.

What is C?

400

This is the term for describing two correlated features

What is Multicollinearity?

600

False positives are known as this type of error

What is Type I error?

600

In order to correctly create an alias that includes spaces (e.g. dept name), this/these character(s) are required in SQL syntax

What are double quotes ("")?

600

Use this to create an anonymous function

What is a lambda function?

600

This SQL clause is executed first

What is the FROM clause?

600

This special hypothesis test is used to determine whether a time series is stationary

What is an Augmented Dickey-Fuller Test?

800

This is the full name of the curve that shows the trade-off between the true positive rate and false positive rate.

What is Area Under the Curve - Receiver Operating Characteristic

800

`.map()` is a method available with this data type

What is a pandas.Series?

800

This non-parametric model incorporates random sampling with replacement of observations and also uses all features.

What are bagged trees?

800

This is the term describing when a set of random variables have constant variance

What is homoskedasticity?

800

In a K Nearest Neighbors classifier, this distance metric is also the shortest distance between two points in vector space.

What is Euclidean distance?

1000

This is the loss function that neural networks typically use to handle multiclassification problems.

What is Categorical Crossentropy?

1000

This method will transform a sparse matrix to a dense matrix.

What is `to_dense()`?

1000

Arguments in a function that have default values are known as this.

What are kwargs?

1000

The "L1" penalty is more commonly known as a regression with this form of regularization.

What is LASSO regression?

1000

This is the link function that bends a Linear Regression into a Logistic Regression

What is the logit function?