k-nearest neighbor
Cloud Computing
Matrix Factorization
Machine Learning
General Terminology
100
Occurs when a new user or item is introduced to the system and there is not enough information available to make accurate recommendations. 

What is the cold start problem?

100

A type of cloud computing service that offers essential compute, storage, and networking resources on demand, on a pay-as-you-go basis. 

What is Infrastructure as a Service?

100
A mathematical function that measures the difference between the predicted output of a model and the actual output. 

What is a cost function?

100

The system is trained on a labelled dataset, where the desired output is already known. The goal is to learn a mapping between the input and output variables. 

What is supervised learning?

100

A statistical measure used to evaluate the performance of a machine learning model by combining precision and recall into a single score, usually their harmonic mean.

FOR PAPER 3: students should know how to calculate this and interpret the results.

What is the F-measure?
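
Because Paper 3 asks for the calculation, a minimal sketch of it follows. The function name `f_measure` and the counts are hypothetical, chosen only for illustration; with `beta = 1` the result is the usual F1 score.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from true positives, false positives and false negatives."""
    if tp == 0:
        return 0.0  # no correct positive predictions, so the score is 0
    precision = tp / (tp + fp)  # share of predicted positives that were right
    recall = tp / (tp + fn)     # share of actual positives that were found
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 8 relevant items recommended, 2 irrelevant items
# recommended, 4 relevant items missed.
score = f_measure(tp=8, fp=2, fn=4)
print(round(score, 3))  # 0.727
```

A score close to 1 means both precision and recall are high; a low score means at least one of them is poor.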

200

A value that is set before the training of a machine learning model and controls the behavior of the model. Examples include the learning rate in a neural network, the number of trees in a random forest, or the number of clusters in a k-means algorithm. 

What is a hyperparameter?

200

A cloud computing model that provides customers with a complete cloud platform (hardware, software, and infrastructure) for developing, running, and managing applications without the cost, complexity, and inflexibility that often come with building and maintaining that platform on premises.

What is a Platform as a Service (PaaS)?

200

A phenomenon in machine learning where a model fits the training data too closely, including its noise, and as a result performs poorly on new, unseen data.

FOR PAPER 3: students should know how to identify and prevent this, and how to use techniques such as regularization. 

What is overfitting?
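
One common form of regularization adds a penalty on large weights to the cost. The sketch below assumes an L2 (squared-weight) penalty on top of mean squared error; the names `regularized_cost` and `lam` and the numbers are illustrative only.

```python
def regularized_cost(residuals, weights, lam):
    """Mean squared error plus an L2 penalty on the model weights."""
    mse = sum(r ** 2 for r in residuals) / len(residuals)
    penalty = lam * sum(w ** 2 for w in weights)  # grows with large weights
    return mse + penalty


# Hypothetical residuals (prediction minus actual) and model weights.
cost = regularized_cost(residuals=[1.0, -1.0], weights=[3.0, 4.0], lam=0.1)
print(cost)  # 1.0 from the errors plus 0.1 * 25 from the penalty
```

A larger `lam` pushes training towards smaller weights, which tends to give a simpler model that is less likely to overfit.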

200

The system is given a dataset without any labels, and the goal is to identify patterns and relationships within the data. This is useful for exploratory data analysis and for finding structure in data.

What is unsupervised learning?

200

A measure of the average magnitude of the errors in a set of predictions, without considering their direction.

FOR PAPER 3: students should know how this can be used to evaluate the performance of a model.

What is mean absolute error (MAE)?
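
A minimal sketch of the calculation, using made-up ratings; the function name `mean_absolute_error` and the data are assumptions for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all predictions."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual and predicted ratings.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 3.0]
print(mean_absolute_error(actual, predicted))  # 0.625
```

Here the predictions are off by 0.625 rating points on average; a lower MAE means a more accurate model.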
300

Refers to the set of data used to train a machine learning model. 

What is training data?

300

A software licensing and delivery model in which software is licensed on a subscription basis and is centrally hosted.

What is Software as a Service (SaaS)?
300

The two matrices that result from decomposing the user-item matrix.

What are the user-feature matrix and the item-feature matrix?
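
To show how the two matrices are used, the sketch below multiplies a user-feature matrix by an item-feature matrix to get predicted ratings. All values are invented, and the two latent features have no particular meaning.

```python
# Hypothetical factor matrices with 2 latent features.
user_features = [[1.0, 0.5],   # user 0
                 [0.2, 1.5]]   # user 1
item_features = [[2.0, 0.0],   # item 0
                 [1.0, 1.0],   # item 1
                 [0.0, 2.0]]   # item 2


def predict(user: int, item: int) -> float:
    """Predicted rating = dot product of the user row and the item row."""
    return sum(u * i for u, i in zip(user_features[user], item_features[item]))


# Rebuild the full (approximate) user-item matrix, one row per user.
predicted = [[predict(u, i) for i in range(len(item_features))]
             for u in range(len(user_features))]
for row in predicted:
    print([round(value, 2) for value in row])
```

Each predicted entry fills a cell of the user-item matrix, including cells where the user never rated the item.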

300

The system is trained through trial and error, receiving rewards or penalties based on its actions. The goal is to learn the best actions to take in a given situation to maximize the rewards. 

What is reinforcement learning?

300

Refers to the ability for individuals to control the collection, use, and dissemination of their personal information. 

FOR PAPER 3: students should know the importance of anonymity in online interactions...and various technologies that can be used for anonymity, such as VPNs, Tor, and blockchain. 

What is the right to privacy?

400
A common problem in recommendation systems where popular items are recommended more often than less popular items. 

What is popularity bias?

400

A cloud deployment model in which the cloud infrastructure is operated solely for a single organization.

What is a private cloud deployment model?

400

A measure of the difference between predicted values and actual values in a dataset. It is calculated as the square root of the mean of the squared differences between predicted and actual values.

FOR PAPER 3: students should know how to calculate this, interpret the results and compare it with other measures. 

What is root-mean-square error (RMSE)?
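
A minimal sketch of the calculation on made-up ratings, with the mean absolute error computed on the same data for comparison; the function name `rmse` and the data are assumptions for illustration.

```python
from math import sqrt


def rmse(actual, predicted):
    """Square root of the mean of the squared prediction errors."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical actual and predicted ratings.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 3.0]

mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
print(rmse(actual, predicted))  # 0.75
print(mae)                      # 0.625
```

RMSE is never smaller than MAE on the same data, because squaring gives large errors more weight; a wide gap between the two suggests a few large errors.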

400

This approach uses user-item interactions to find similar users and make recommendations based on their preferences. It can be implemented using memory-based or model-based techniques. 

What is collaborative filtering?
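
A minimal memory-based sketch: find the most similar user and recommend what they rated that the target user has not seen. The users, items, and ratings are invented, and computing cosine similarity over co-rated items only is a simplifying assumption.

```python
from math import sqrt

# Hypothetical user -> {item: rating} data.
ratings = {
    "ana": {"A": 5, "B": 4, "C": 1},
    "ben": {"A": 5, "B": 5, "D": 4},
    "cat": {"A": 1, "C": 5, "E": 4},
}


def cosine_similarity(r1, r2):
    """Cosine similarity over the items both users have rated."""
    shared = set(r1) & set(r2)
    if not shared:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in shared)
    norm1 = sqrt(sum(r1[i] ** 2 for i in shared))
    norm2 = sqrt(sum(r2[i] ** 2 for i in shared))
    return dot / (norm1 * norm2)


def recommend(user):
    """Items rated by the single nearest neighbour that `user` has not rated."""
    others = [u for u in ratings if u != user]
    nearest = max(others, key=lambda u: cosine_similarity(ratings[user], ratings[u]))
    return sorted(i for i in ratings[nearest] if i not in ratings[user])


print(recommend("ana"))  # ['D']
```

Ana's ratings resemble Ben's far more than Cat's, so Ben's unseen item D is recommended. Using more than one neighbour gives a k-nearest-neighbour version of the same idea.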

400

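An optimization algorithm used to minimize a cost function in machine learning by iteratively updating the model parameters in the direction of the negative gradient, with the gradient estimated at each step from a single randomly chosen training example or a small batch rather than the whole dataset.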

FOR PAPER 3: students should know about the advantages and disadvantages compared to other optimization algorithms. 

What is stochastic gradient descent?
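
A minimal sketch of the update rule, fitting the single weight `w` in `y = w * x` with squared error. The data, learning rate, step count, and seed are assumptions chosen so the example converges.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical noise-free training data generated from y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

w = 0.0               # model parameter, starting guess
learning_rate = 0.05  # hyperparameter, set before training

for _ in range(200):
    x, y = random.choice(data)        # "stochastic": one random example per step
    gradient = 2 * (w * x - y) * x    # derivative of (w*x - y)**2 with respect to w
    w -= learning_rate * gradient     # step in the direction of the negative gradient

print(round(w, 4))  # 2.0
```

Each step is cheap because it looks at one example, but the path is noisier than gradient descent over the full dataset; with a learning rate that is too large the updates can diverge.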

500
A mathematical expression used to determine the performance of a machine learning model. 

What is the cost function?

500

Refers to data that is collected from observing and recording the actions, decisions, and interactions of individuals or groups. 

What is behavioural data?

500

The dataset is not complete.

What is the main difficulty with using matrix factorization?

500

This approach uses the properties or features of items to make recommendations. It generates recommendations by finding items that are similar to items that the user has liked in the past. 

What is content-based filtering?

500
A type of machine learning that focuses on training agents to make decisions in an environment by rewarding or punishing them based on their actions. 


FOR PAPER 3: students should know the fundamental concepts such as Markov Decision Process, Q-learning, and policy gradient methods. 

What is reinforcement learning?
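
As an illustration of Q-learning, the sketch below applies one tabular update, `Q(s, a) += alpha * (reward + gamma * max Q(s', a') - Q(s, a))`. The states, actions, and values are hypothetical; `alpha` is the learning rate and `gamma` the discount factor.

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update to the table `q` in place."""
    next_values = q.get(next_state, {})
    best_next = max(next_values.values()) if next_values else 0.0
    target = reward + gamma * best_next          # estimated return after this step
    q[state][action] += alpha * (target - q[state][action])


# Hypothetical Q-table: state -> {action: estimated value}.
q_table = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 1.0},
}

# The agent moved right from s0, got no reward, and landed in s1.
q_update(q_table, "s0", "right", reward=0.0, next_state="s1")
print(q_table["s0"]["right"])  # 0.45
```

The value of moving right from s0 rises because it leads to a state where a good action is available, even though the step itself earned no reward.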
