Unsupervised Learning
Hodgepodge
Time Series
Bayesian Inference
Experimental Design / A/B Testing
100

We must do WHAT to our data before applying many unsupervised learning methods (e.g. clustering, recommender systems, PCA)?

Scale/Normalise our data
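
For instance, a minimal sketch using scikit-learn's StandardScaler (the feature matrix X here is made up):

# Standardize features to zero mean and unit variance before clustering/PCA
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])   # hypothetical features
X_scaled = StandardScaler().fit_transform(X)               # each column now has mean 0, std 1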


100

You build a model to detect breast cancer - what is your evaluation metric?

Sensitivity
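
Sensitivity is the same quantity as recall; a quick sketch with scikit-learn (the labels are made up):

from sklearn.metrics import recall_score

y_true = [1, 1, 0, 1, 0]   # hypothetical labels: 1 = cancer present
y_pred = [1, 0, 0, 1, 0]
sensitivity = recall_score(y_true, y_pred)   # TP / (TP + FN) = 2/3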

100

Factors that affect a time series at some fixed and known frequency.

Seasonality
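
One way to inspect seasonality is seasonal_decompose from a recent statsmodels; a rough sketch on a made-up monthly series:

import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# hypothetical monthly series: a trend plus a repeating 12-month cycle
idx = pd.date_range("2020-01-01", periods=36, freq="MS")
sales = pd.Series(10 + np.arange(36) + 5 * np.sin(2 * np.pi * np.arange(36) / 12), index=idx)
decomp = seasonal_decompose(sales, period=12)   # trend, seasonal, and residual components
print(decomp.seasonal.head())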

100

How do we determine the likelihood in Bayesian Inference?

Data

100

Another name for the basic "thing" you are experimenting on (think about the rows in your dataset).

Experimental Unit

200

One of two metrics for comparing observations, where more similar observations get values close to 0.

Pairwise Distance
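
A quick sketch with scikit-learn's pairwise_distances (the observations are made up); smaller values mean more similar rows:

import numpy as np
from sklearn.metrics import pairwise_distances

X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]])        # hypothetical observations
D = pairwise_distances(X, metric="euclidean")             # D[i, j] near 0 => rows i and j are similar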

200

These methods add a penalty term to a model's loss function and can reduce overfitting.

Regularization
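
For example, ridge and lasso add an L2 or L1 penalty to a linear model; a minimal sketch on synthetic data:

import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, 0.0]) + rng.normal(scale=0.1, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty shrinks coefficients toward 0
lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty can zero out coefficients entirely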

200

This pandas method allows you to use previous lags of a time series as features in a time series model. 

.shift()
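
A rough sketch of building lag features with .shift() (the series is made up):

import pandas as pd

y = pd.Series([10, 12, 13, 15, 18], name="y")
df = pd.DataFrame({"y": y, "lag_1": y.shift(1), "lag_2": y.shift(2)}).dropna()
# lag_1 and lag_2 can now be used as features to predict y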

200

When our prior and likelihood distributions are in the same form.

Conjugacy
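
The Beta-Binomial pair is the classic example: a Beta prior with binomial data gives a Beta posterior, so the update is a closed-form sketch (the counts below are made up):

# Beta(a, b) prior + Binomial likelihood -> Beta(a + successes, b + failures) posterior
a_prior, b_prior = 2, 2            # hypothetical prior
successes, failures = 30, 20       # hypothetical data
a_post, b_post = a_prior + successes, b_prior + failures   # posterior is still a Beta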

200

Another name for the different groups in an experiment.

Treatment

300

One of two metrics for comparing observations, where more similar observations get values close to 1.

Cosine Similarity
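
A quick sketch with scikit-learn (the observations are made up); values near 1 mean the rows point in the same direction:

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

X = np.array([[1.0, 0.0], [2.0, 0.1], [0.0, 1.0]])   # hypothetical observations
S = cosine_similarity(X)                             # S[0, 1] is close to 1; S[0, 2] is close to 0
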
300

This evaluation metric measures the performance of a regression model but applies a greater penalty as the number of features increases.

adjusted R^2
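
The adjustment divides out degrees of freedom; a sketch of the formula in code (n observations, k features, with R^2 assumed already computed):

def adjusted_r2(r2, n, k):
    # penalizes R^2 as the number of features k grows relative to the sample size n
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

adjusted_r2(0.90, n=100, k=10)   # ~0.889, lower than the raw R^2 of 0.90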

300

You perform an ADF test on your time series and get a p-value of 0.03. What do you do?

Nothing! Your data is stationary.
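
A rough sketch of running the test with statsmodels on a synthetic series; the null hypothesis is that the series has a unit root (is non-stationary):

import numpy as np
from statsmodels.tsa.stattools import adfuller

y = np.random.default_rng(0).normal(size=200)   # white noise: stationary
stat, pvalue, *_ = adfuller(y)
# pvalue < 0.05 -> reject the unit-root null -> treat the series as stationary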

300

The Bayesian equivalent of a confidence interval

Credible Interval
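
For example, with a Beta posterior a 95% credible interval is just the middle 95% of the posterior; a sketch with scipy (the posterior parameters are made up):

from scipy import stats

posterior = stats.beta(a=32, b=22)             # hypothetical Beta posterior
lower, upper = posterior.ppf([0.025, 0.975])   # 95% credible interval
# reads as: "there is a 95% probability the parameter lies in (lower, upper)"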


300

In an experiment, this is another term for your variable of interest.

Response Variable

400

This clustering algorithm is less sensitive to outliers.

DBSCAN
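
A minimal sketch with scikit-learn (the eps and min_samples values are arbitrary):

import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([[0, 0], [0.1, 0.1], [0.2, 0], [5, 5], [5.1, 5], [100, 100]], dtype=float)
labels = DBSCAN(eps=0.5, min_samples=2).fit_predict(X)
# the far-away point [100, 100] is labeled -1 (noise) rather than pulled into a cluster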

400

GridSearchCV's best params are determined by which evaluation metric?

Mean cross-validated score
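
A sketch with scikit-learn (the estimator and grid are arbitrary); best_params_ corresponds to the highest mean cross-validated score:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)
grid = GridSearchCV(LogisticRegression(max_iter=1000), {"C": [0.1, 1, 10]}, cv=5)
grid.fit(X, y)
grid.best_params_   # chosen by the highest mean CV score, stored in grid.best_score_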

400

You perform an ADF test on your time series and get a p-value of 0.09. What do you do?

Difference the data and perform another ADF test.
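
A sketch of first-differencing with pandas before re-testing (the series is synthetic):

import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

ts = pd.Series(np.cumsum(np.random.default_rng(0).normal(size=200)))   # random walk: non-stationary
ts_diff = ts.diff().dropna()             # first difference removes the stochastic trend
stat, pvalue, *_ = adfuller(ts_diff)     # re-run the ADF test on the differenced series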

400

We use these computational methods when our prior and likelihood make calculating the posterior very difficult.

MCMC (Markov chain Monte Carlo)
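
A toy Metropolis sampler (one flavor of MCMC) for a Beta-Binomial posterior, just to illustrate the idea; all numbers are made up:

import numpy as np
from scipy import stats

successes, trials = 30, 50       # hypothetical data
rng = np.random.default_rng(0)

def log_post(theta):
    if not 0 < theta < 1:
        return -np.inf
    # log prior (Beta(2, 2)) + log likelihood (Binomial)
    return stats.beta.logpdf(theta, 2, 2) + stats.binom.logpmf(successes, trials, theta)

theta, samples = 0.5, []
for _ in range(5000):
    proposal = theta + rng.normal(scale=0.05)   # random-walk proposal
    if np.log(rng.uniform()) < log_post(proposal) - log_post(theta):
        theta = proposal                        # accept; otherwise keep the current value
    samples.append(theta)
# np.mean(samples[1000:]) approximates the posterior mean after discarding burn-in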

400

The p-value represents the probability of what?

Obtaining results as extreme as, or more extreme than, those observed, given that the null hypothesis is true.

500

PCA uses what to rank/order our Principal Components?

Proportion of explained variance
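
A quick sketch with scikit-learn (scaling first, per the earlier clue):

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
pca = PCA().fit(StandardScaler().fit_transform(X))
pca.explained_variance_ratio_   # components are ordered by proportion of explained variance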

500

An SVM that cannot separate our data completely (e.g. there are points on the wrong side of the boundary) is also called a:

Soft Margin Classifier
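
In scikit-learn the C hyperparameter controls how soft the margin is; a small sketch on synthetic, non-separable data:

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(50, 2)), rng.normal(1.5, 1, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)              # overlapping classes: no perfect separation
clf = SVC(kernel="linear", C=0.1).fit(X, y)    # smaller C = softer margin, more tolerated violations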

500

The hyperparameter p represents what component of an ARIMA model?

The AR component; the number of lags to include in our model
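
A sketch with a recent statsmodels, where order=(p, d, q); here p=2 means two autoregressive lags (the series is synthetic):

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

y = np.cumsum(np.random.default_rng(0).normal(size=200))   # hypothetical series
model = ARIMA(y, order=(2, 1, 0)).fit()   # p=2 AR lags, d=1 difference, q=0 MA terms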

500

This distribution is a common choice for a Prior Distribution when your parameter of interest is a probability (e.g. the probability Candidate A will win some election).

Beta Distribution

500

To stratify an experiment based on some fixed effects variable.

Blocking
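
Blocking amounts to randomizing within strata of a known, fixed variable; a rough sketch with pandas (the data frame and column names are made up):

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({"unit": range(8), "sex": ["F"] * 4 + ["M"] * 4})   # hypothetical units

# assign treatment/control separately within each block of the fixed variable
df["group"] = ""
for _, idx in df.groupby("sex").groups.items():
    df.loc[idx, "group"] = rng.permutation(["treatment", "control"] * (len(idx) // 2))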
