The slope coefficient in a regression line represents:
a) The intercept
b) The rate of change in Y per unit change in X
c) The average of X and Y
d) The residual value
b) The rate of change in Y per unit change in X
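For reference, a minimal sketch of this idea (assuming scikit-learn and NumPy are installed; the toy data is illustrative, not part of the question):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data where y increases by roughly 2 for each unit increase in x.
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

model = LinearRegression().fit(X, y)
print(model.coef_[0])    # slope: change in y per unit change in x (~2)
print(model.intercept_)  # intercept: predicted y when x = 0
```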
Logistic regression predicts:
a) Continuous values
b) Probabilities between 0 and 1
c) Slopes and intercepts
d) Residuals
b) Probabilities between 0 and 1
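A quick illustrative sketch (scikit-learn assumed; the six labeled points are made up):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# One feature, binary labels.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)
# predict_proba returns class probabilities; each row sums to 1.
print(clf.predict_proba([[3.5]]))  # values between 0 and 1
```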
Which of the following distance measures cannot be used in KNN?
a) Cosine
b) Euclidean
c) Manhattan
d) Minkowski
e) None of the above; any of them can be used
e) None of the above; any of them can be used
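As a sketch of why all four work (scikit-learn assumed; the iris dataset is just a convenient example):

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Each of these metrics is accepted by scikit-learn's KNN classifier.
for metric in ["euclidean", "manhattan", "minkowski", "cosine"]:
    clf = KNeighborsClassifier(n_neighbors=5, metric=metric).fit(X, y)
    print(metric, clf.score(X, y))
```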
If a model performs well on training data but poorly on test data, it’s:
a) Underfit
b) Overfit
c) Optimized
d) Balanced
b) Overfit
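A small sketch of how this shows up in practice (scikit-learn assumed; synthetic data and an unpruned tree chosen purely for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# An unconstrained decision tree memorizes the training set.
tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("train:", tree.score(X_tr, y_tr))  # typically 1.0
print("test: ", tree.score(X_te, y_te))  # noticeably lower -> overfit
```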
The main purpose of PCA is to:
a) Increase data size
b) Reduce dimensionality while keeping maximum variance
c) Cluster data
d) Remove dependent features
b) Reduce dimensionality while keeping maximum variance
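A minimal sketch (scikit-learn assumed; iris is an arbitrary 4-feature example):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

# Project 4-D data down to 2 components.
pca = PCA(n_components=2).fit(X)
print(pca.explained_variance_ratio_)        # variance kept per component
print(pca.explained_variance_ratio_.sum())  # total variance retained
```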
Simple linear regression uses a line to approximate the relationship between which of the following?
a) Coefficients and dependent variables
b) Independent variables and residuals
c) Independent and dependent variables
d) None of these
c) Independent and dependent variables
A Naive Bayes classifier assumes that all features are:
a) Dependent
b) Independent
c) Continuous
d) Nonlinear
b) Independent
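A sketch of the assumption in code (scikit-learn assumed; iris used only as an example):

```python
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)

# GaussianNB models each feature with an independent, per-class Gaussian,
# so the joint likelihood factorizes into a product over features.
clf = GaussianNB().fit(X, y)
print(clf.predict_proba(X[:1]))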
K-means tries to minimize:
a) Inter-cluster distance
b) Sum of squared distances within clusters
c) The number of clusters
d) The model’s bias
b) Sum of squared distances within clusters
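For reference, a sketch with scikit-learn (synthetic blobs chosen for illustration):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
# inertia_ is the quantity K-means minimizes: the sum of squared
# distances from each point to its assigned cluster center.
print(km.inertia_)
```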
When a model is too simple and fails to capture patterns, it is:
a) Regularized
b) Overfit
c) Underfit
d) Optimized
c) Underfit
LDA differs from PCA because LDA:
a) Ignores class labels
b) Uses labels to maximize class separation
c) Minimizes variance
d) Removes all correlated features
b) Uses labels to maximize class separation
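A side-by-side sketch of the API difference (scikit-learn assumed; iris is illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# PCA ignores the labels entirely; LDA requires them.
X_pca = PCA(n_components=2).fit_transform(X)                            # unsupervised
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)  # supervised
print(X_pca.shape, X_lda.shape)
```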
Simple linear regression shows the relationship between:
a) Independent and dependent variables
b) Coefficients and residuals
c) Predicted and actual features
d) Two dependent variables
a) Independent and dependent variables
Which of the following describes one advantage of Logistic Regression over Linear Regression?
a) Logistic Regression is less computationally complex than Linear Regression
b) Logistic Regression has better performance on continuous data than Linear Regression
c) Logistic Regression is less sensitive to outlier data than Linear Regression
d) Logistic Regression is more sensitive to outlier data than Linear Regression
c) Logistic Regression is less sensitive to outlier data than Linear Regression
In DBSCAN, epsilon (eps) defines:
a) The radius of a point’s neighborhood
b) The number of clusters
c) The total number of data points
d) The minimum distance between clusters
a) The radius of a point’s neighborhood
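A minimal sketch (scikit-learn assumed; the eps and min_samples values below are arbitrary illustrations):

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=200, noise=0.05, random_state=0)

# eps is the neighborhood radius; min_samples is the density threshold.
db = DBSCAN(eps=0.3, min_samples=5).fit(X)
print(np.unique(db.labels_))  # cluster ids; -1 marks noise points
```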
Lasso regression differs from Ridge regression because:
a) Lasso can eliminate features by setting coefficients to zero
b) Lasso cannot handle linear data
c) Ridge regression doesn’t penalize coefficients
d) Lasso increases model variance
a) Lasso can eliminate features by setting coefficients to zero
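A sketch of the contrast (scikit-learn assumed; the regression data and alpha value are illustrative):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# 10 features, only 3 of which are actually informative.
X, y = make_regression(n_samples=100, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

lasso = Lasso(alpha=1.0).fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)
print("lasso zeros:", np.sum(lasso.coef_ == 0))  # typically several exact zeros
print("ridge zeros:", np.sum(ridge.coef_ == 0))  # usually none
```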
PCA is an ________ technique, while LDA is a ________ technique.
a) Supervised, Unsupervised
b) Unsupervised, Supervised
c) Regression, Classification
d) Linear, Nonlinear
b) Unsupervised, Supervised
Polynomial regression with a very high degree often leads to:
a) Better generalization
b) More bias
c) Overfitting
d) Simpler models
c) Overfitting
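A sketch of the effect (scikit-learn and NumPy assumed; the sine data and degrees are illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 20)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 20)

for degree in (3, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X, y)
    print(degree, model.score(X, y))  # the high degree chases the noise
```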
Bayes’ Theorem helps ML models compute:
a) Residual variance
b) Conditional probabilities
c) Regression coefficients
d) Distance metrics
b) Conditional probabilities
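A worked example of the theorem (the prevalence and test rates below are hypothetical numbers, not from the quiz):

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
# Hypothetical: a disease with 1% prevalence and a test that is
# 95% sensitive with a 5% false-positive rate.
p_disease = 0.01
p_pos_given_disease = 0.95
p_pos_given_healthy = 0.05

p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(p_disease_given_pos)  # ~0.16: a positive test is far from certain
```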
Why should training inputs be scaled (standardized and normalized) when using KNN?
a) The inputs do not need to be scaled or normalized for KNN
b) Because KNN is a density-based algorithm
c) To prevent features with larger scales from dominating the distance metric
d) To prevent overfitting if the inputs are not scaled and normalized
c) To prevent features with larger scales from dominating the distance metric
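A sketch of the effect (scikit-learn assumed; the wine dataset is a convenient example because its features span very different scales, and the scores are approximate):

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)  # features on very different scales

raw = KNeighborsClassifier(n_neighbors=5)
scaled = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
print("raw:   ", cross_val_score(raw, X, y).mean())     # roughly 0.7
print("scaled:", cross_val_score(scaled, X, y).mean())  # roughly 0.95
```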
Of these combinations of train and test scores, which is closest to an overfit model?
a) Train: 0.78, Test: 0.59
b) Train: 0.67, Test: 0.65
c) Train: 0.59, Test: 0.78
d) Train: 0.61, Test: 0.61
a) Train: 0.78, Test: 0.59
What do the eigenvectors represent in PCA?
a) The covariance of the features along the diagonal
b) The amount of variance attached to each PC
c) The direction of the one PC with the most variance
d) The directions of the new principal axes
d) The directions of the new principal axes
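A sketch of where these live in scikit-learn (iris is illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

pca = PCA(n_components=2).fit(X)
# Each row of components_ is an eigenvector of the covariance matrix:
# a unit vector giving the direction of one new principal axis.
print(pca.components_)
print(pca.explained_variance_)  # the eigenvalues: variance along each axis
```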
What does a high R-squared value indicate?
a) The model fits the data well
b) The model is overfit
c) There is no correlation
d) The slope is near zero
a) The model fits the data well
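A minimal sketch (scikit-learn and NumPy assumed; the near-linear toy data is illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

X = np.arange(10).reshape(-1, 1)
y = 3 * X.ravel() + np.random.default_rng(0).normal(0, 1, 10)

model = LinearRegression().fit(X, y)
# R^2 near 1 means the model explains most of the variance in y.
print(r2_score(y, model.predict(X)))
```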
Logistic Regression cannot use a residual calculation (distance from data point to a model classification boundary) due to which of the following?
a) The errors are too large for the sigmoid function
b) Mapping the data from the sigmoid back to the linear (log-odds) scale produces values of +/- infinity
c) The mean squared error results in divide by 0 values from the sigmoid
d) Residuals cannot be defined for categorical outcomes
b) Mapping the data from the sigmoid back to the linear (log-odds) scale produces values of +/- infinity
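A short sketch of why (NumPy assumed; the logit is the inverse of the sigmoid):

```python
import numpy as np

# The logit maps probabilities back to the linear (log-odds) scale.
def logit(p):
    return np.log(p / (1 - p))

# Class labels sit exactly at 0 and 1, where the transform blows up,
# so residuals on the linear scale are not defined.
with np.errstate(divide="ignore"):
    print(logit(np.array([0.0, 0.5, 1.0])))  # [-inf, 0., inf]
```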
Which of the following best describes how a cluster is formed in DBSCAN?
a) By iteratively minimizing the distances between points and their cluster centers
b) By connecting dense regions of points based on the epsilon (eps) neighborhood and the minimum points criteria
c) By assigning every data point to the nearest cluster
d) By minimizing the number of noise points in a cluster, removing any point that falls within its neighborhood
b) By connecting dense regions of points based on the epsilon (eps) neighborhood and the minimum points criteria
Increasing the regularization parameter (λ) in Lasso regression generally:
a) Decreases bias
b) Increases variance
c) Increases bias, reduces variance
d) Has no effect
c) Increases bias, reduces variance
Which of the following is the best definition of 'within-class scatter' in LDA?
a) The perpendicular distance between each class center and the overall dataset center
b) The distances along the axis of maximum variation across all data in the dataset
c) The distances of each sample in a class to the mean of that same class
d) None of these
e) The distances from each class center to the overall dataset center
c) The distances of each sample in a class to the mean of that same class
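A sketch of computing the within-class scatter matrix directly (NumPy and scikit-learn assumed; iris is illustrative):

```python
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# Within-class scatter: sum, over classes, of squared deviations of each
# sample from its own class mean.
S_w = np.zeros((X.shape[1], X.shape[1]))
for c in np.unique(y):
    Xc = X[y == c]
    d = Xc - Xc.mean(axis=0)
    S_w += d.T @ d
print(S_w)
```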