_____ and _____ are the two major interpretations of probability
Frequentist and Bayesian
Where can an estimator exist in a pipeline?
As the last element
What is the only required hyperparameter of HDBSCAN?
min_samples
What is the best choice of k (the number of clusters) based on the following results?

3
What is the angle of PC1?

45 degrees
These are the two distributions (probabilities of an event) named in Bayes' Rule.
Prior and posterior
Three core methods available on a Pipeline
What are .fit(), .predict(), .score()?
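A minimal sketch of those three calls on a pipeline whose last step is an estimator. The iris dataset and the step names are assumptions chosen only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Example data (an assumption; any labeled dataset would do).
X, y = load_iris(return_X_y=True)

# The transformer comes first; the estimator is the last element.
pipe = Pipeline([("preprocessing", StandardScaler()), ("classifier", SVC())])

pipe.fit(X, y)                 # fits the scaler, then fits the SVC on the scaled data
predictions = pipe.predict(X)  # scales X with the fitted scaler, then predicts
accuracy = pipe.score(X, y)    # mean accuracy of the final classifier
```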
What does the epsilon hyperparameter of DBSCAN do?
It sets the search distance (the radius of the neighborhood around each point) used when building a cluster
What is a reasonable choice for k?

6
Which of the above graphs shows better PCA performance?

Left
The probability of getting a "Diamond 4" card given that you know the card you drew is a 4
1/4
The keyword to specify in GridSearchCV the various configurations to try
What is param_grid?
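As a rough sketch, `param_grid` is a dictionary mapping parameter names to the values to try; every combination becomes one candidate. The dataset and the specific values below are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # example data only

# 3 values of C times 2 kernels = 6 configurations to evaluate.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

grid = GridSearchCV(SVC(), param_grid=param_grid, cv=5)
grid.fit(X, y)

n_candidates = len(grid.cv_results_["params"])
best = grid.best_params_  # dict with the winning "C" and "kernel"
```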
This linkage method measures the distance between two clusters as the minimum distance between a point in one cluster and a point in the other.
Single Linkage
What is the difference between agglomerative hierarchical clustering and divisive hierarchical clustering?
Agglomerative is bottom-up (each point starts as its own cluster and the closest clusters are merged step by step); divisive is top-down (all points start in one cluster, which is split recursively)
What will happen when eigenvalues are roughly equal?
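PCA will perform badly: no direction captures much more variance than the others, so dropping components discards a lot of information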
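A small sketch of why this happens, using synthetic data (an assumption) and the eigenvalues of the covariance matrix. The helper name `explained_ratio` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# A round cloud has roughly equal variance in every direction;
# the stretched cloud has one clearly dominant direction.
round_cloud = rng.normal(size=(1000, 2))
stretched_cloud = round_cloud * np.array([5.0, 0.5])

def explained_ratio(X):
    """Share of total variance carried by each principal component."""
    eigenvalues = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]  # largest first
    return eigenvalues / eigenvalues.sum()

round_ratio = explained_ratio(round_cloud)          # both shares close to one half
stretched_ratio = explained_ratio(stretched_cloud)  # first share close to one
```

With roughly equal eigenvalues, keeping only PC1 throws away about half of the variance; with one dominant eigenvalue, PC1 alone keeps almost all of it.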
The probability of a "Heart 8" given that the card is black
0
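Both card questions can be checked by counting, assuming a standard 52-card deck where every card is equally likely. The helper `conditional` is a hypothetical name used only in this sketch.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = {"Diamond": "red", "Heart": "red", "Club": "black", "Spade": "black"}
deck = list(product(suits, ranks))  # 52 (suit, rank) pairs

def conditional(event, given):
    """P(event | given), found by counting cards that satisfy the condition."""
    condition = [card for card in deck if given(card)]
    both = [card for card in condition if event(card)]
    return Fraction(len(both), len(condition))

# One of the four 4s is the Diamond 4.
p_diamond4_given_4 = conditional(lambda c: c == ("Diamond", "4"), lambda c: c[1] == "4")

# Hearts are red, so no black card can be the Heart 8.
p_heart8_given_black = conditional(lambda c: c == ("Heart", "8"), lambda c: suits[c[0]] == "black")
```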
What is the difference between Pipeline and make_pipeline?
make_pipeline assigns the step names for you
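A short sketch comparing the two constructors. The step names passed to `Pipeline` are arbitrary choices; `make_pipeline` derives its names from the lowercased class names.

```python
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Pipeline: you choose each step name yourself.
explicit = Pipeline([("preprocessing", StandardScaler()), ("classifier", SVC())])

# make_pipeline: names are generated from the class names.
automatic = make_pipeline(StandardScaler(), SVC())

explicit_names = [name for name, _ in explicit.steps]
automatic_names = [name for name, _ in automatic.steps]
```

The generated names matter when tuning: a grid search over the automatic pipeline would refer to `svc__C` rather than `classifier__C`.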
What does DBSCAN fully stand for?
Density-Based Spatial Clustering of Applications with Noise
What is optimal in terms of cohesion and separation?
Small cohesion (close to zero)
Large separation
What two assumptions does PCA make about the data to work properly?
What is the denominator in the formula for Bayes' Theorem?
P(B), the marginal probability of the observed data (the evidence)
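A worked sketch of Bayes' Theorem, P(A|B) = P(B|A) P(A) / P(B), showing where the denominator comes from. The function name and all numbers are made up for illustration.

```python
from fractions import Fraction

def posterior(prior, likelihood, likelihood_if_not):
    """Return P(A | B) from P(A), P(B | A) and P(B | not A)."""
    # Denominator P(B): total probability of the data over both cases.
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: prior P(A) = 1/100, P(B | A) = 9/10, P(B | not A) = 1/10.
result = posterior(Fraction(1, 100), Fraction(9, 10), Fraction(1, 10))
```

Here P(B) = 9/1000 + 99/1000 = 108/1000, so the posterior is 9/108 = 1/12: the prior of 1/100 is updated to 1/12 after seeing the data.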
Given `pipe = Pipeline([('preprocessing', StandardScaler()), ('classifier', SVC())])`, how do you specify other models for GridSearchCV to try in param_grid?
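pipe = Pipeline([('preprocessing', StandardScaler()), ('classifier', SVC())])
parameters = [{'classifier': [SVC()]}, {'classifier': [RandomForestClassifier()]}]  # one dict per candidate model, keyed by the step name; RandomForestClassifier is only an example alternative
grid = GridSearchCV(pipe, parameters, scoring='accuracy')  # a classification metric, since the final step is a classifier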
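A fuller, runnable sketch of swapping models through `param_grid`: use the step name as a key whose values are estimator instances, and `stepname__param` keys for that model's own hyperparameters. The dataset, the choice of `RandomForestClassifier`, and the value lists are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # example data only

pipe = Pipeline([("preprocessing", StandardScaler()), ("classifier", SVC())])

# One dict per model, so each model is only paired with its own hyperparameters.
parameters = [
    {"classifier": [SVC()], "classifier__C": [0.1, 1, 10]},
    {
        "classifier": [RandomForestClassifier(random_state=0)],
        "classifier__n_estimators": [10, 50],
    },
]

grid = GridSearchCV(pipe, parameters, scoring="accuracy", cv=5)
grid.fit(X, y)

n_candidates = len(grid.cv_results_["params"])  # 3 SVC settings + 2 forest settings
best_model_name = type(grid.best_estimator_.named_steps["classifier"]).__name__
```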
Define border points in terms of the hyperparameters min_samples and epsilon
Points that lie within epsilon of a core point (so they still belong to the cluster) but have fewer than min_samples points within epsilon of themselves
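A minimal sketch separating core, border, and noise points with scikit-learn's `DBSCAN`. The six points and the values `eps=1.1`, `min_samples=3` are assumptions chosen so that each category appears.

```python
import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],  # dense square
    [2.0, 1.0],                                       # hangs off the square
    [10.0, 10.0],                                     # isolated point
])

# min_samples counts the point itself among its neighbors.
db = DBSCAN(eps=1.1, min_samples=3).fit(X)

core_mask = np.zeros(len(X), dtype=bool)
core_mask[db.core_sample_indices_] = True

# Border: assigned to a cluster but not a core point.
border_mask = (db.labels_ != -1) & ~core_mask
# Noise: label -1, not reachable from any core point.
noise_mask = db.labels_ == -1
```

The point at (2, 1) has only two points within epsilon (itself and (1, 1)), so it is not core, but it sits within epsilon of the core point (1, 1) and therefore joins the cluster as a border point.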
This linkage method measures the distance between two clusters as the maximum distance between an observation in one cluster and an observation in the other cluster.
Complete Linkage
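Single and complete linkage differ only in which pairwise distance they keep. A small sketch with two made-up clusters on a line:

```python
from itertools import product
from math import dist

# Hypothetical clusters, chosen so the distances are easy to check by hand.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

# Every distance between a point in one cluster and a point in the other.
pairwise = [dist(p, q) for p, q in product(cluster_a, cluster_b)]

single_linkage = min(pairwise)    # closest pair: (1, 0) and (4, 0)
complete_linkage = max(pairwise)  # farthest pair: (0, 0) and (6, 0)
```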
What should we always do before PCA?
Standardization
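A sketch of what goes wrong without standardization, using two synthetic, independent features on very different scales (the feature names and values are assumptions).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
height_m = rng.normal(1.7, 0.1, size=200)      # variance around 0.01
income = rng.normal(50_000, 10_000, size=200)  # variance around 1e8
X = np.column_stack([height_m, income])

# Without scaling, PC1 is essentially the income axis.
raw_ratio = PCA(n_components=2).fit(X).explained_variance_ratio_

# After standardization both features contribute on equal footing.
X_scaled = StandardScaler().fit_transform(X)
scaled_ratio = PCA(n_components=2).fit(X_scaled).explained_variance_ratio_
```

On the raw data the first component explains nearly all of the variance simply because income has larger units, not because it carries more structure.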