We must do WHAT to our data before applying many unsupervised learning methods (e.g. clustering, recommender systems, PCA)?
Scale/Normalise our data
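A minimal sketch with scikit-learn's StandardScaler (toy feature matrix assumed):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature matrix with wildly different column scales
X = np.array([[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]])

# StandardScaler subtracts each column's mean and divides by its std,
# so every feature contributes on a comparable scale to distance-based
# methods like k-means or PCA.
X_scaled = StandardScaler().fit_transform(X)

print(X_scaled.mean(axis=0))  # each column now has mean ~0
print(X_scaled.std(axis=0))   # and standard deviation ~1
```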
You build a model to detect breast cancer - what is your evaluation metric?
Sensitivity
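Sensitivity is recall in scikit-learn; a small sketch with made-up labels:

```python
from sklearn.metrics import recall_score

# Sensitivity (recall) = TP / (TP + FN): of the actual cancer cases,
# what fraction did we catch? Missing a case (FN) is the costly error here.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = cancer
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]  # one missed case, one false alarm

sensitivity = recall_score(y_true, y_pred)
print(sensitivity)  # 3 of 4 true cases caught -> 0.75
```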
Factors that affect a time series with some fixed and known frequency.
Seasonality
How do we determine the likelihood in Bayesian Inference?
Data
Another name for the basic "thing" you are experimenting on (think about the rows in your dataset).
Experimental Unit
One of two metrics that ranks more similar observations with values close to 0.
Pairwise Distance
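A quick sketch with scikit-learn's `pairwise_distances` (toy points assumed):

```python
import numpy as np
from sklearn.metrics import pairwise_distances

# Euclidean pairwise distances: identical observations score 0,
# and larger values mean less similar.
X = np.array([[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]])
D = pairwise_distances(X)

print(D[0, 1])  # identical rows -> 0.0
print(D[0, 2])  # 3-4-5 triangle -> 5.0
```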
These methods penalize model complexity, which can reduce overfitting.
Regularization
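A sketch of the effect using Ridge (L2) regularization, with a synthetic dataset where only one feature matters:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 15))            # few samples, many features
y = X[:, 0] + 0.1 * rng.normal(size=20)  # only one feature is real signal

# Ridge adds an L2 penalty alpha * ||w||^2 to the least-squares loss,
# shrinking coefficients toward zero relative to plain OLS.
ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)

print(np.abs(ols.coef_).sum())    # unpenalized weights
print(np.abs(ridge.coef_).sum())  # shrunk weights (smaller total)
```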
This pandas method allows you to use previous lags of a time series as features in a time series model.
.shift()
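A minimal lag-feature sketch with a toy monthly series:

```python
import pandas as pd

# A toy monthly series
s = pd.Series([10, 12, 13, 15, 16],
              index=pd.date_range("2024-01-01", periods=5, freq="MS"))

# .shift(k) moves the series down k steps, so each row sees the value
# from k periods ago -- ready to use as a lag feature.
df = pd.DataFrame({"y": s, "lag_1": s.shift(1), "lag_2": s.shift(2)})
print(df)  # the first rows of each lag column are NaN
```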
When our prior and likelihood distributions are in the same form.
Conjugacy
Another name for the different groups in an experiment.
Treatment
One of two metrics that ranks more similar observations with values close to 1.
Cosine Similarity
This evaluation metric measures the performance of a regression model, but penalizes the model more as the number of features increases.
adjusted R^2
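The formula makes the penalty concrete; a small sketch with made-up numbers:

```python
# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1),
# where n = observations and k = features. Adding features raises k,
# which drags the adjusted score down even if R^2 stays the same.
def adjusted_r2(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Same R^2, more features -> worse adjusted R^2
print(adjusted_r2(0.90, n=100, k=5))   # ~0.8947
print(adjusted_r2(0.90, n=100, k=50))  # ~0.7980
```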
You perform an ADF test on your time series and get a p-value of 0.03. What do you do?
Nothing! Your data is stationary.
The Bayesian equivalent of a confidence interval
Credible Interval
In an experiment, this is another term for your variable of interest.
Response Variable
This clustering algorithm is less sensitive to outliers.
DBSCAN
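A sketch of why: DBSCAN can label outliers as noise instead of forcing them into a cluster (toy blobs assumed):

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two tight blobs plus a far-away outlier. Points without enough
# neighbours within eps are labelled noise (-1) rather than assigned
# to a cluster, which is why outliers hurt DBSCAN less than k-means.
X = np.array([[0, 0], [0.1, 0], [0, 0.1],
              [5, 5], [5.1, 5], [5, 5.1],
              [100, 100]])  # the outlier

labels = DBSCAN(eps=0.5, min_samples=2).fit_predict(X)
print(labels)  # outlier gets label -1
```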
GridSearch's best params are determined by which evaluation metric?
Mean cross-validated score
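A minimal sketch with scikit-learn's GridSearchCV on the iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# GridSearchCV fits every parameter combination with cross-validation
# and picks best_params_ by the mean CV score across folds.
grid = GridSearchCV(KNeighborsClassifier(),
                    param_grid={"n_neighbors": [1, 5, 15]},
                    cv=5)
grid.fit(X, y)

print(grid.best_params_)
print(grid.best_score_)  # mean cross-validated score of the winner
```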
You perform an ADF test on your time series and get a p-value of 0.09. What do you do?
Difference the data and perform another ADF test.
We use these computational methods when our prior and likelihood make calculating the posterior very difficult.
MCMC (Markov chain Monte Carlo)
The p-value represents the probability of what?
Obtaining results as extreme as or more extreme than ours, given that the null hypothesis is true.
PCA uses what to rank/order our Principal Components?
Proportion of explained variance
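A sketch with scikit-learn, using synthetic columns of decreasing variance:

```python
import numpy as np
from sklearn.decomposition import PCA

# Principal components are ordered by the fraction of total variance
# they explain: explained_variance_ratio_ comes back sorted descending.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4)) * np.array([10.0, 3.0, 1.0, 0.1])

pca = PCA().fit(X)
print(pca.explained_variance_ratio_)  # sums to 1, largest first
```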
An SVM that cannot separate our data completely (e.g. there are points on the wrong side of the boundary) is also called a:
Soft Margin Classifier
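A sketch with scikit-learn's SVC on overlapping synthetic classes (the data and C value are made up for illustration):

```python
import numpy as np
from sklearn.svm import SVC

# Overlapping classes: no hyperplane separates them perfectly. A soft
# margin SVM tolerates points on the wrong side of the boundary,
# trading training errors for a wider margin; smaller C = softer margin.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1.5, size=(50, 2)),
               rng.normal(1, 1.5, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="linear", C=0.1).fit(X, y)
print(clf.score(X, y))  # typically < 1.0: some training points misclassified
```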
The hyperparameter p represents what feature in an ARIMA model?
The AR component; the number of lags to include in our model
This distribution is a common choice for a Prior Distribution when your parameter of interest is a probability (e.g. the probability Candidate A will win an election).
Beta Distribution
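A sketch of the Beta-Binomial update with scipy (prior strength and poll numbers are made up):

```python
from scipy import stats

# Beta(a, b) is conjugate to the Binomial likelihood: after observing
# k successes in n trials, the posterior is Beta(a + k, b + n - k).
a, b = 2, 2          # weak prior belief centred on 0.5
k, n = 60, 100       # poll: 60 of 100 respondents favour Candidate A

posterior = stats.beta(a + k, b + n - k)
print(posterior.mean())  # (a + k) / (a + b + n) = 62/104 ~ 0.596
```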
To stratify an experiment based on some fixed-effects variable.
Blocking