What is Machine Learning?
What is the key difference between supervised and unsupervised learning?
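The key difference is that supervised learning uses labeled data (inputs paired with known outputs) to learn to predict those outputs, while unsupervised learning uses unlabeled data and looks for structure in it, such as clusters.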
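A minimal sketch of the distinction, assuming scikit-learn is installed. The toy data and variable names are made up for illustration. The supervised estimator is fitted on features and labels, while the clustering estimator sees features only.

```python
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Toy, made-up data: two well-separated groups of 1-D points.
X = [[0.0], [0.2], [0.4], [5.0], [5.2], [5.4]]
y = [0, 0, 0, 1, 1, 1]  # labels exist only in the supervised setting

# Supervised: fit() receives both the features and the labels.
clf = LogisticRegression().fit(X, y)
supervised_pred = clf.predict([[0.1], [5.1]])

# Unsupervised: fit() receives only the features; groups are discovered.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
cluster_ids = km.labels_
```

The cluster ids carry no meaning beyond "same group" or "different group", whereas the supervised predictions are expressed in the label values supplied in y.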
What are the advantages of using Random Forest over a single Decision Tree?
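Random Forest reduces overfitting and usually improves accuracy by combining the predictions of multiple decision trees, each trained on a random sample of the data. In some implementations it also handles missing data better.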
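To illustrate what "combining" means in practice, the sketch below assumes scikit-learn and a synthetic dataset generated only for this example. It averages the per-tree class probabilities by hand and compares the result with the forest's own output.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, generated only for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

forest = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)

# Average the class probabilities predicted by each individual tree.
tree_probs = np.mean(
    [tree.predict_proba(X) for tree in forest.estimators_], axis=0
)

# In scikit-learn, the forest's probabilities are this same average.
matches = np.allclose(tree_probs, forest.predict_proba(X))
```

Averaging many trees that each saw a different sample of the data is what smooths out the quirks of any single tree.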
What does "training" mean in machine learning?
Training refers to the process of feeding data to a model so it can learn patterns and make predictions.
What is reinforcement learning?
Reinforcement learning is a type of machine learning where an agent learns by interacting with the environment and receiving rewards or penalties.
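The reward-and-penalty loop can be sketched with a tiny epsilon-greedy agent in plain Python. The two-action environment, the reward values, and the exploration rate are all hypothetical choices for illustration. This is one simple strategy, not the only way to do reinforcement learning.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical environment: action 1 earns a reward, action 0 a penalty.
def step(action):
    return 1.0 if action == 1 else -1.0

values = [0.0, 0.0]   # the agent's running estimate of each action's reward
counts = [0, 0]       # how often each action has been tried
epsilon = 0.1         # probability of exploring a random action

for _ in range(500):
    if rng.random() < epsilon:
        action = rng.randrange(2)                        # explore
    else:
        action = max(range(2), key=lambda a: values[a])  # exploit best estimate
    reward = step(action)
    counts[action] += 1
    # Incremental mean: move the estimate toward the observed reward.
    values[action] += (reward - values[action]) / counts[action]

best_action = max(range(2), key=lambda a: values[a])
```

The agent is never told which action is correct. It infers that from the rewards and penalties it receives.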
What is cross-validation used for?
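Cross-validation is used to estimate how well a machine learning model will work on new data. The data is split into parts (folds), and the model is repeatedly trained on all but one fold and tested on the held-out fold, so that each fold serves as the test set once.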
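A minimal sketch of 5-fold cross-validation, assuming scikit-learn. The iris dataset and the logistic regression model are arbitrary stand-ins chosen for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = load_iris(return_X_y=True)
kfold = KFold(n_splits=5, shuffle=True, random_state=0)

scores = []
for train_idx, test_idx in kfold.split(X):
    # Train on four folds, then score on the fold that was held out.
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))

# The mean over folds is the cross-validated accuracy estimate.
mean_accuracy = float(np.mean(scores))
```

scikit-learn's cross_val_score wraps this loop in a single call. The explicit version is shown here only to make the splitting visible.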
What is a Decision Tree in machine learning?
A decision tree splits data into branches based on conditions on feature values; each path from the root ends in a leaf that gives the prediction.
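To see the learned conditions, a fitted scikit-learn tree can be printed as text. The sketch below uses a made-up one-feature dataset, and the feature name "x" is an arbitrary label chosen for this example.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Made-up data: small values belong to class 0, large values to class 1.
X = [[1], [2], [3], [10], [11], [12]]
y = [0, 0, 0, 1, 1, 1]

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Render the branch conditions as indented text.
rules = export_text(tree, feature_names=["x"])
print(rules)

# New points follow the branches down to a leaf.
predictions = tree.predict([[4], [9]])
```

On data this simple, a single split on x is enough to separate the two classes.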
How does the elbow method help in K-Means clustering?
The elbow method helps determine the optimal number of clusters by plotting the within-cluster sum of squares (inertia) against the number of clusters and identifying the point where adding more clusters yields only diminishing improvements.
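The idea can be sketched without a plot by computing the inertia for several values of k, assuming scikit-learn. The three hand-picked blob centers are an assumption made so that the expected elbow is known in advance.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with three well-separated groups (centers chosen by hand).
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=0,
)

# Inertia is the within-cluster sum of squared distances to the centroids.
inertias = []
for k in range(1, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)

# Improvement gained by each additional cluster; the elbow is where it collapses.
drops = [inertias[i] - inertias[i + 1] for i in range(len(inertias) - 1)]
```

Here the drop from 2 to 3 clusters is far larger than the drop from 3 to 4, which points to k = 3. On real data the elbow is often less sharp and calls for judgment.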
What is the role of max_depth in Random Forest, and why is it important?
max_depth limits how deep each decision tree in the forest can grow. A deeper tree captures more detail but may overfit, while a shallow tree may underfit the data.
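A rough sketch of the effect, assuming scikit-learn and a noisy synthetic dataset generated only for this example. It fits one forest with max_depth=3 and one with no depth limit, then inspects how deep the individual trees actually grew.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data with some label noise (flip_y) so deep trees have noise to fit.
X, y = make_classification(n_samples=300, n_features=10, flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

shallow = RandomForestClassifier(n_estimators=20, max_depth=3, random_state=0)
deep = RandomForestClassifier(n_estimators=20, max_depth=None, random_state=0)
shallow.fit(X_train, y_train)
deep.fit(X_train, y_train)

# Depth actually reached by each tree in the two forests.
shallow_depths = [tree.get_depth() for tree in shallow.estimators_]
deep_depths = [tree.get_depth() for tree in deep.estimators_]

# Compare training and held-out accuracy to judge over- or underfitting.
scores = {
    "shallow": (shallow.score(X_train, y_train), shallow.score(X_test, y_test)),
    "deep": (deep.score(X_train, y_train), deep.score(X_test, y_test)),
}
```

A large gap between training and held-out accuracy suggests overfitting. Low accuracy on both suggests underfitting. The exact numbers depend on the data.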
How do machine learning algorithms differ from traditional algorithms?
Machine learning algorithms learn from data to make predictions or decisions, while traditional algorithms follow predefined rules created by programmers without learning from data.
What is the role of 'k' in K-Nearest Neighbors (KNN)?
'k' represents the number of nearest neighbors considered when classifying a data point; the point is assigned the majority class among those k neighbors.
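The influence of k can be shown with a hand-rolled nearest-neighbor vote in plain Python. This is a teaching sketch on made-up 1-D data, not scikit-learn's KNeighborsClassifier. The function name knn_predict is hypothetical.

```python
from collections import Counter

# Made-up 1-D training points with their class labels.
points = [0.5, 1.0, 1.2, 3.0, 3.5]
labels = ["A", "B", "B", "A", "A"]

def knn_predict(query, k):
    # Rank training points by distance to the query and keep the k closest.
    nearest = sorted(range(len(points)), key=lambda i: abs(points[i] - query))[:k]
    # Majority vote among the k neighbors decides the class.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Same query point, different k.
predictions = {k: knn_predict(0.0, k) for k in (1, 3, 5)}
```

For the query 0.0, the prediction is "A" with k=1, "B" with k=3, and "A" again with k=5, so the choice of k can change the outcome.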
What is the difference between overfitting and underfitting in a machine learning model?
Overfitting occurs when a model performs well on training data but poorly on unseen data due to learning noise and irrelevant details.
Underfitting happens when the model is too simple and fails to capture the underlying patterns.
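A small numerical sketch of both failure modes, assuming NumPy. Polynomials of increasing degree are fitted to a noisy sine curve. The data, noise level, and degrees are made-up choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: a sine curve observed with noise at 15 training points.
x_train = np.linspace(0.0, 1.0, 15)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.2, size=x_train.shape)

# Dense noise-free grid standing in for "unseen" data.
x_test = np.linspace(0.0, 1.0, 200)
y_test = np.sin(2 * np.pi * x_test)

train_mse, test_mse = {}, {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse[degree] = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    test_mse[degree] = float(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
```

The straight line (degree 1) is too simple and has high error on both sets. Training error keeps falling as the degree grows, but beyond a point the extra flexibility mostly fits noise, which tends to show up as worse error on the unseen grid.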
In the following code, what is n_estimators used for in Random Forest?
from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)
n_estimators specifies the number of trees in the Random Forest.
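This can be checked on a fitted model. The sketch below assumes scikit-learn and uses synthetic data as a stand-in for X_train and y_train, which the snippet above leaves undefined.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the undefined X_train / y_train.
X_train, y_train = make_classification(n_samples=200, n_features=8, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# One fitted DecisionTreeClassifier is stored per requested estimator.
n_trees = len(clf.estimators_)
```

More trees generally give more stable predictions at the cost of longer training and prediction time.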
What does the fit method do in this K-Means clustering example?
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=3)
kmeans.fit(X)
The fit method computes the centroids of the 3 clusters from X and assigns each data point in X to its nearest centroid; the assignments are stored in kmeans.labels_.
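What fit leaves behind can be inspected on the fitted estimator. The sketch below assumes scikit-learn and a made-up array X. n_init and random_state are set only to keep the run repeatable.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up 2-D points forming three loose groups.
X = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
    [8.0, 8.0], [8.2, 7.9], [7.9, 8.3],
    [1.0, 8.0], [0.9, 8.2], [1.1, 7.8],
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans.fit(X)

centroids = kmeans.cluster_centers_          # shape (3, 2): one centroid per cluster
labels = kmeans.labels_                      # cluster index assigned to each row of X
new_label = kmeans.predict([[8.1, 8.1]])[0]  # nearest centroid for an unseen point
```

The cluster indices themselves are arbitrary. Only which points share an index is meaningful.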
What is the role of eps in DBSCAN?
from sklearn.cluster import DBSCAN
db = DBSCAN(eps=0.3)
db.fit(X)
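eps defines the maximum distance between two points for them to be considered neighbors. Together with min_samples, it determines which points form dense regions (clusters) and which are labeled as noise.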
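The sensitivity to eps can be sketched on a small made-up dataset, assuming scikit-learn. min_samples is left at its default of 5, and the helper count_clusters is a hypothetical name introduced for this example.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Made-up points: two tight groups of five and one isolated point.
X = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [0.1, 0.1], [0.0, 0.1],   # group A
    [1.0, 1.0], [1.1, 1.0], [1.2, 1.0], [1.1, 1.1], [1.0, 1.1],   # group B
    [5.0, 5.0],                                                   # isolated point
])

def count_clusters(eps):
    labels = DBSCAN(eps=eps).fit(X).labels_   # min_samples keeps its default of 5
    return len(set(labels) - {-1})            # label -1 marks noise, not a cluster

clusters_by_eps = {eps: count_clusters(eps) for eps in (0.05, 0.3, 2.0)}
```

With eps=0.05 no point has enough neighbors, so everything is noise. With eps=0.3 the two groups are found as separate clusters. With eps=2.0 they merge into one. The isolated point stays noise in all three cases.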