The science and art of programming computers so they can learn from data
What is machine learning?
The desired solution value included with each training item
What is a label?
A dataset containing known answers used to train a supervised learning algorithm
What is a labeled dataset?
ML technique that learns patterns from labeled data
What is supervised learning?
A general remedy that helps nearly any machine learning algorithm perform better
What is using more data?
Predicting the classes of items
What is classification?
A single column of data; a single, individual, measurable property or characteristic
What is a feature?
A dataset that the learning algorithm uses to learn patterns
What is a training dataset?
ML technique that learns patterns from unlabeled data
What is unsupervised learning?
The model performs well on the training data but generalizes poorly to new instances
What is overfitting?
Predicting target values of items
What is regression?
A single row of data, with one value for each of the features
What is a sample or training instance?
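The feature and sample cards can be pictured as the columns and rows of a table. A minimal sketch, assuming a made-up dataset whose field names (height_cm, weight_kg) are purely illustrative:

```python
# Hypothetical toy dataset: each dict is one sample (row),
# and each key is one feature (column).
dataset = [
    {"height_cm": 170, "weight_kg": 65},
    {"height_cm": 182, "weight_kg": 80},
    {"height_cm": 158, "weight_kg": 52},
]

# One sample: a single row with one value per feature.
sample = dataset[1]

# One feature: a single column, collected across all samples.
height_feature = [row["height_cm"] for row in dataset]

print(sample)
print(height_feature)
```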
A dataset used to evaluate the final performance of a learning algorithm on novel data
What is a testing dataset?
Technique used to hard-code patterns to make predictions from data
What is traditional programming?
The selected model is unable to learn the underlying structure of the data
What is underfitting?
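The overfitting and underfitting cards can be contrasted with two deliberately extreme toy models. This is only a sketch under stated assumptions: the data, the memorizer model, and the constant model are all hypothetical and chosen to make the effect visible.

```python
# Hypothetical toy data: targets roughly follow y = 2x,
# with a little noise in the training set.
train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
test = [(1.4, 2.8), (2.6, 5.2), (3.4, 6.8)]

def memorizer(x):
    """Overfits: returns the target of the closest training input."""
    _, nearest_y = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_y

def constant(x):
    """Underfits: ignores x and always returns the mean training target."""
    return sum(y for _, y in train) / len(train)

def mse(model, data):
    """Mean squared error of a model over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

for name, model in [("memorizer", memorizer), ("constant", constant)]:
    print(name, "train:", round(mse(model, train), 3),
          "test:", round(mse(model, test), 3))
```

The memorizer has zero error on the training data but a clearly nonzero error on the new points, while the constant model has a large error on both sets because it cannot capture the trend at all.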
ML algorithms that are trained on all the available data in one go
What is batch/offline learning?
A value of the learning model that is automatically found by training on the data
What is a parameter?
A dataset used to evaluate a learning algorithm's performance while tuning the model's hyperparameters
What is a validation dataset?
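The training, validation, and testing dataset cards describe three disjoint slices of the same data. A minimal sketch of one way to make such a split, assuming a hypothetical helper split_dataset and illustrative 60/20/20 fractions that are not prescribed by the cards:

```python
import random

def split_dataset(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle a copy of samples and cut it into train/validation/test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, validation, test

train_set, validation_set, test_set = split_dataset(range(10))
print(len(train_set), len(validation_set), len(test_set))
```

The model learns from train_set, hyperparameters are compared on validation_set, and test_set is touched only once at the end.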
ML technique in which an agent learns by interacting with an environment and receiving rewards and penalties
What is reinforcement learning?
A remedy for irrelevant features, which may distract the algorithm: keep only the most useful features
What is feature selection?
ML algorithms that are trained incrementally, continually improving with new samples
What is online learning?
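The difference between batch and online learning can be sketched with a deliberately tiny learner. The RunningMean class below is a hypothetical illustration, not a standard API: its only parameter is the mean of the targets seen so far, updated one sample at a time.

```python
class RunningMean:
    """Toy online learner: its single parameter is the mean of the targets seen."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def learn_one(self, y):
        """Update the estimate incrementally from one new sample."""
        self.count += 1
        self.mean += (y - self.mean) / self.count

stream = [3.0, 5.0, 4.0, 8.0]

# Online: samples arrive one by one and the model improves after each.
model = RunningMean()
for y in stream:
    model.learn_one(y)

# Batch: the same estimate computed from all the data in one go.
batch_mean = sum(stream) / len(stream)

print(model.mean, batch_mean)
```

Both approaches reach the same estimate here, but the online version never needs to hold the whole stream in memory.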
A parameter of the learning algorithm, not of the model, that is set by the programmer
What is a hyperparameter?
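The parameter and hyperparameter cards can be told apart in a small gradient descent sketch. Everything here is an assumption made for illustration: the function fit_slope, the one-weight model y = w * x, and the toy data.

```python
def fit_slope(xs, ys, learning_rate=0.01, n_steps=200):
    """Fit y ~ w * x by gradient descent on the mean squared error.

    learning_rate and n_steps are hyperparameters: set by the programmer.
    w is a parameter: found automatically by training on the data.
    """
    w = 0.0
    n = len(xs)
    for _ in range(n_steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # exactly y = 2x

w = fit_slope(xs, ys, learning_rate=0.01, n_steps=200)
print(round(w, 3))
```

Changing learning_rate or n_steps changes how training proceeds, while w is whatever value training ends up with.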
FREE
FREE
ML technique that uses a minimal amount of labeled data alongside a large amount of unlabeled data
What is semisupervised learning?
When the training data is selected in a way that makes it inherently nonrepresentative
What is sampling bias?
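Sampling bias can be made concrete with a small, fully deterministic sketch. The population of ages and both sampling schemes below are hypothetical and chosen only to show the effect.

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical population: one person of each age from 18 to 77.
population = list(range(18, 78))

# Biased selection: only the 15 youngest people are sampled.
biased_sample = population[:15]

# More representative selection: every 4th person across the whole range.
spread_sample = population[::4]

print("population mean:", mean(population))
print("biased sample mean:", mean(biased_sample))
print("spread sample mean:", mean(spread_sample))
```

Both samples have the same size, yet the biased one gives a far worse estimate of the population mean because of how it was selected.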