What term describes the phenomenon where a model performs well on the training data but fails to generalize to new, unseen data?
What is overfitting?
What is the training error of a 1-Nearest-Neighbor Model?
In the context of hypothesis testing, this type of error occurs when the null hypothesis is incorrectly rejected.
What is a Type-I Error
Which data structure follows a LIFO (Last In First Out) order of operations?
What is a stack?
Who is the president of CMU Data Science Club?
Al

In time series data, what word describes the correlation of a time series with its own lagged values?
What is autocorrelation?
In the context of binary classification in machine learning, this metric is defined as the number of true positive results divided by the number of all positive results, including those not identified correctly.
What is Precision?
This theorem states that, regardless of the shape of the original population distribution, the distribution of the sample mean will approach a normal distribution as the sample size increases.
What is the Central Limit Theorem?
What term is defined as: the number of entries N in table divided by the table capacity M
What is the load factor?
What two numbers are at the beginning of every statistics course offered at CMU?
36
 
 
What type of plot is this, and what do the colors represent?
What is a Mosaic Plot with Pearson Residuals?
This ensemble learning method combines multiple weak learners to create a strong learner for improving model predictions. One common technique involves training subsequent models to correct the errors of previous ones.
What is Boosting?
:max_bytes(150000):strip_icc()/poisson-56a8fa9e3df78cf772a26eb0.jpg)
What distribution is this?
What is the Poisson Distribution
A binary tree in which the value of each node is less than or equal to the value of its parent.
What is a MaxHeap?
Who is Peter Freeman?

What is the name of the dimensionality reduction technique done to transform high-dimensional data into a lower-dimensional space while retaining the most important information?
What is Principal Component Analysis (PCA)?
This loss function, commonly used in training deep neural networks for classification tasks, measures the difference between two probability distributions for a given set of predictions and true values.
What is Cross-Entropy-Loss?

What assumption does our data violate based on this residual plot?
This advanced tree structure balances itself automatically, ensuring a logarithmic height and efficient search, insertion, and deletion operations.
What is an AVL Tree?
What is the ranking of CMU's Graduate Statistics Program on USNews (in terms of all American Grad Schools)?
What is #5?
Which kind of feature regularization works by adding a penalty term based on the squared values of the coefficients?
What is Ridge Regression?
In the realm of neural networks, this complex recurrent neural network architecture, designed to process sequences of data by using a gating mechanism to control the memorizing process, addresses the vanishing gradient problem common to simpler RNNs.
What is Long Short-Term Memory (LSTM)?

What distribution is this?
What is the Pareto Distribution
In the context of memory management, this term refers to a situation where a program incorrectly accesses a memory location outside its allocated space, often resulting in unpredictable behavior.
What is a segmentation fault?
What is the name of the Poker Bot that CMU created in 2019 that defeated the world’s best players at multi-player no-limit Texas Hold’em?
What is Pluribus?