Machine Learning Jeopardy Template

Basic Concepts

Medium Level

Advanced Topics

100

What is Machine Learning?

Machine Learning is the field of study that allows computers to learn from data and improve performance without explicit programming.

100

What is the key difference between supervised and unsupervised learning?

The key difference is that supervised learning uses labeled data, while unsupervised learning uses unlabeled data.

100

What are the advantages of using Random Forest over a single Decision Tree?

Random Forest reduces overfitting and usually improves accuracy by combining the predictions of many decision trees, each trained on a random sample of the data and features.

200

What does "training" mean in machine learning?

Training refers to the process of feeding data to a model so it can learn patterns and make predictions.

200

What is reinforcement learning?

Reinforcement learning is a type of machine learning where an agent learns by interacting with the environment and receiving rewards or penalties.

200

What is cross-validation used for?

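Cross-validation is used to estimate how well a machine learning model will work on new data. The data is split into parts (folds); the model is trained on all but one fold and tested on the held-out fold, and this is repeated until every fold has been used for testing once.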
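The splitting step can be sketched in plain Python. This is a minimal illustration, not scikit-learn's implementation; the helper name k_fold_indices is hypothetical, and the sketch does not shuffle the data first, which is often done in practice.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, 5))
for train, test in folds:
    print("train:", train, "test:", test)
```

Each sample lands in the test set exactly once, so every data point contributes to the final performance estimate.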

300

What is a Decision Tree in machine learning?

A decision tree is a model that splits data into branches based on conditions on the features, ending in leaves that each give a prediction.

300

How does the elbow method help in K-Means clustering?

The elbow method helps determine a good number of clusters by plotting the within-cluster sum of squared distances (inertia) against the number of clusters and identifying the point where adding more clusters yields only diminishing improvements.
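A rough sketch of the elbow method with scikit-learn follows. The synthetic data from make_blobs and the range of k values are assumptions chosen for illustration; inertia_ is the sum of squared distances from each point to its nearest centroid.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data generated around 3 centers (an assumption for illustration).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

inertias = []
for k in range(1, 7):
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    # inertia_ always shrinks as k grows; the elbow is where the drop levels off.
    inertias.append(kmeans.inertia_)

for k, inertia in zip(range(1, 7), inertias):
    print(k, round(inertia, 1))
```

Plotting inertias against k and looking for the bend is the usual way to read the result; the printed values give the same information in text form.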

300

What is the role of max_depth in Random Forest, and why is it important?

max_depth limits how deep each decision tree in the forest can grow. A deeper tree captures more detail but may overfit, while a shallow tree may underfit the data.
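The trade-off can be observed by comparing training and test accuracy at two depths. This is a sketch on synthetic data from make_classification; the dataset, the split, and the choice of 50 trees are all assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scores = {}
for depth in (1, None):  # None places no limit on how deep each tree can grow
    clf = RandomForestClassifier(n_estimators=50, max_depth=depth, random_state=0)
    clf.fit(X_train, y_train)
    # Store (training accuracy, test accuracy) for this depth setting.
    scores[depth] = (clf.score(X_train, y_train), clf.score(X_test, y_test))
    print(depth, scores[depth])
```

A large gap between training and test accuracy for the unlimited-depth forest would suggest overfitting, while low accuracy on both for max_depth=1 would suggest underfitting.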

400

How do machine learning algorithms differ from traditional algorithms?

Machine learning algorithms learn from data to make predictions or decisions, while traditional algorithms follow predefined rules created by programmers without learning from data.

400

What is the role of 'k' in K-Nearest Neighbors (KNN)?

'k' represents the number of nearest neighbors whose labels are considered when classifying a data point.
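The effect of 'k' can be seen in a small plain-Python sketch. The function knn_predict and the toy points are hypothetical, written only for illustration; ties are not handled specially.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# One "b" point sits very close to the query, surrounded by "a" points.
points = [(0, 0), (0, 2), (2, 0), (1.0, 1.1), (5, 5), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1, 1), k=1))  # only the single closest point votes
print(knn_predict(points, labels, (1, 1), k=3))  # the three closest points vote
```

With k=1 the lone nearby "b" point decides the label; with k=3 the surrounding "a" points outvote it, which is why a small k is more sensitive to noise.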

400

What is the difference between overfitting and underfitting in a machine learning model?

Overfitting occurs when a model performs well on training data but poorly on unseen data due to learning noise and irrelevant details.

Underfitting happens when the model is too simple and fails to capture the underlying patterns.

500

In the following code, what is n_estimators used for in Random Forest?

from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)

n_estimators specifies the number of trees in the Random Forest.

500

What does the fit method do in this K-Means clustering example?

from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=3)
kmeans.fit(X)

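The fit method learns the 3 cluster centroids from X. As part of fitting, each data point in X is assigned to its nearest centroid, and these assignments are stored in kmeans.labels_.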
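What fit leaves behind on the model can be inspected directly. The sketch below uses a small made-up array X with three obvious groups; the data and the n_init and random_state settings are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Three tight, hypothetical groups of 2-D points.
X = np.array([[0, 0], [0, 1], [1, 0],
              [10, 10], [10, 11], [11, 10],
              [20, 0], [20, 1], [21, 0]], dtype=float)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans.fit(X)

print(kmeans.cluster_centers_)  # learned centroids, one row per cluster
print(kmeans.labels_)           # cluster index assigned to each row of X
print(kmeans.predict(np.array([[19.0, 0.5]])))  # nearest centroid for a new point
```

The cluster index numbers themselves are arbitrary; only which points share an index is meaningful.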

500

What is the role of eps in DBSCAN?

from sklearn.cluster import DBSCAN
db = DBSCAN(eps=0.3)
db.fit(X)

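eps defines the maximum distance between two points for them to be considered neighbors. Larger values of eps merge more points into the same cluster, while smaller values leave more points labeled as noise.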
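The neighborhood test behind eps can be sketched in plain Python. The helper eps_neighbors and the toy points are hypothetical and show only the distance check, not the full DBSCAN algorithm.

```python
import math


def eps_neighbors(points, index, eps):
    """Return indices of points within distance eps of points[index] (itself included)."""
    return [j for j, other in enumerate(points)
            if math.dist(points[index], other) <= eps]


points = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.25), (1.0, 1.0)]

print(eps_neighbors(points, 0, eps=0.3))  # the far point at (1.0, 1.0) is excluded
print(eps_neighbors(points, 0, eps=2.0))  # a larger eps pulls in every point
```

In DBSCAN, a point whose eps-neighborhood contains at least min_samples points is treated as a core point, so the choice of eps directly controls how dense a region must be to form a cluster.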