Grab Bag
Values
Datasets
Learning
Problems
100

The science and art of programming computers so they can learn from data

What is machine learning?

100

The desired solution value included with each training item

What is a label?

100

A dataset containing known answers used to train a supervised learning algorithm

What is a labeled dataset?

100

ML technique that learns patterns from labeled data

What is supervised learning?

100

A general remedy that makes nearly any machine learning algorithm perform better

What is using more data?

200

Predicting the classes of items

What is classification?

200

A single column of data; an individual measurable property or characteristic

What is a feature?

200

A dataset from which the learning algorithm learns patterns

What is a training dataset?

200

ML technique that learns patterns from unlabeled data

What is unsupervised learning?

200

The model performs great on the training data but generalizes poorly to new instances

What is overfitting?

300

Predicting target values of items

What is regression?

300

A single row of data, with one value for each of the features

What is a sample or training instance?

300

A dataset used to evaluate the final performance of a learning algorithm on novel data

What is a testing dataset?

300

Technique used to hard-code patterns to make predictions from data

What is traditional programming?

300

The selected model is unable to learn the underlying structure of the data

What is underfitting?

400

ML algorithms that train on all the available data in one go

What is batch/offline learning?

400

A value of the learning model that is automatically found by training on the data

What is a parameter?

400

A dataset used to evaluate the performance of a learning algorithm while tuning its hyperparameters

What is a validation dataset?

400

ML technique that learns by having an agent interact with an environment, getting rewards and penalties

What is reinforcement learning?

400

A remedy for irrelevant features in the data, which may distract the algorithm

What is feature selection?

500

ML algorithms that are trained incrementally, continually improving with new samples

What is online learning?

500

A parameter of the learning algorithm, not of the model, that is set by the programmer

What is a hyperparameter?

500

FREE

FREE

500

ML technique that uses a minimal amount of labeled data alongside a large amount of unlabeled data

What is semisupervised learning?

500

When the training data is selected in a way that makes it inherently nonrepresentative

What is sampling bias?
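
Study aid for the Datasets column: a minimal sketch, in plain Python, of how one labeled dataset might be divided into training, validation, and testing datasets. The function name `split_dataset`, the 60/20/20 fractions, and the toy data are all assumptions made for illustration; they do not come from the cards.

```python
import random


def split_dataset(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle labeled samples, then cut them into three disjoint sets.

    The fractions are illustrative assumptions, not a recommended standard.
    """
    shuffled = list(samples)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # shuffle so each set mixes all samples
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]                 # used to learn the parameters
    val = shuffled[n_train:n_train + n_val]    # used while tuning hyperparameters
    test = shuffled[n_train + n_val:]          # held back for the final evaluation
    return train, val, test


# Hypothetical labeled dataset: each sample is (features, label).
# The features are the two columns (x, x squared); the label is x modulo 2.
dataset = [((x, x * x), x % 2) for x in range(10)]
train, val, test = split_dataset(dataset)
```

Shuffling before the split is one simple guard against an ordering in the source data leaking into a single set. It does not fix sampling bias in how the data was collected.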