Topics Before the Midterm
Topics Before the Midterm pt 2.
Machine Learning
Mixed Bag

What are nominal data, ordinal data, discrete data, and continuous data?

Nominal data: Data that can be labeled or classified into mutually exclusive categories that have no inherent order (hair color, gender, ethnicity)

Ordinal data: Data grouped into categories that have a natural order (for example: low, medium, high)

Discrete data: Data that take only whole-number values (number of people, number of doors)

Continuous data: Data that can take any value, including decimals (height, length, temperature)


What is a key informant interview?

What is a focus group interview?

Key informant interview: A qualitative data collection method used to gather in-depth information from individuals who have specialized knowledge or insights about a particular topic and/or a particular group of people

Focus group interview: A participatory, qualitative research method involving a small, diverse group of people who are part of a target population


What is supervised learning?

What is unsupervised learning?

Supervised learning: A type of machine learning in which an algorithm is trained on a labeled dataset, where the correct outputs are already known

Unsupervised learning: Methods that attempt to learn the structure underlying the data. They do not require labeled data and discover patterns in the data without any prior knowledge of the outputs


What are the Steps to Building a Dashboard?

1. Understand the context

2. Define the objective

3. Identify your audience

4. Familiarize yourself with your data

5. Draw your dashboard on paper

6. Plan how to sell your dashboard within the organization


What is the Food Consumption Score? What is it a proxy indicator to measure?

A composite score based on dietary diversity, food frequency, and the relative nutritional importance of different food groups

It is calculated using the frequency of consumption of different food groups


What does ETL mean?

Extract, Transform, and Load

Extract: Retrieves and verifies data from various sources

Transform: Processes and organizes the extracted data so that it is usable

Load: Moves transformed data to a data repository
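The three ETL steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a production pipeline: the CSV text, the column names (household_id, region, members), and the households table are all made up for the example, and an in-memory SQLite database stands in for the data repository.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; the columns and values are illustrative only.
RAW_CSV = """household_id,region,members
H001, north ,4
H002,South,
H003,north,6
"""

def extract(text):
    """Extract: read rows from the source and verify the expected columns exist."""
    rows = list(csv.DictReader(io.StringIO(text)))
    expected = {"household_id", "region", "members"}
    if rows and set(rows[0]) != expected:
        raise ValueError("unexpected columns in source")
    return rows

def transform(rows):
    """Transform: drop incomplete rows, tidy text fields, convert types."""
    cleaned = []
    for row in rows:
        if not row["members"].strip():
            continue  # skip rows with a missing household size
        cleaned.append((
            row["household_id"].strip(),
            row["region"].strip().lower(),
            int(row["members"]),
        ))
    return cleaned

def load(records, connection):
    """Load: move the transformed records into the repository (here, SQLite)."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS households "
        "(household_id TEXT, region TEXT, members INTEGER)"
    )
    connection.executemany("INSERT INTO households VALUES (?, ?, ?)", records)
    connection.commit()

connection = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), connection)
loaded = connection.execute(
    "SELECT * FROM households ORDER BY household_id"
).fetchall()
print(loaded)
```

The row for H002 is dropped during the transform step because its household size is missing, so only two cleaned records reach the database.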


What is PII?

Personally Identifiable Information: any type of information relating to a natural person that can lead to their identification


What is K-Means Clustering?

An unsupervised learning algorithm used for clustering similar data points into K groups. It does not require labeled data
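To make the idea concrete, here is a minimal sketch of K-Means in plain Python. It alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. The sample points are invented, and starting from the first k points is an assumption made so the run is repeatable; real implementations usually choose starting centroids at random.

```python
import math

def k_means(points, k, iterations=10):
    """Cluster 2-D points into k groups with basic K-Means."""
    # Assumption: start from the first k points so the result is deterministic.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids

# Invented sample data: two loose groups, one near (1, 1) and one near (9, 9).
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

No labels are given to the function; the groups come only from how close the points are to each other.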


What is a Convolutional Neural Network?

A specialized type of NN in which each layer consists of a set of filters that transform the input data so that the NN can learn the features of the data it is being fed

Particularly effective for image recognition
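A rough sketch of what a single filter does: it slides over the image and produces a feature map that is large wherever the pattern it looks for appears. The tiny image and the hand-picked vertical-edge filter below are assumptions for illustration; in a real CNN the filter values are learned during training, and libraries implement this sliding product (technically a cross-correlation) far more efficiently.

```python
def convolve2d(image, kernel):
    """Slide a kernel over an image (no padding, stride 1); return the feature map."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Multiply the kernel by the image patch under it and sum the result.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map

# Invented 3x4 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, vertical_edge)
print(feature_map)
```

The feature map is nonzero only in the middle column, which is where the edge sits in the image.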


What is the HDD indicator?

Household Dietary Diversity: the number of different food groups consumed over a given reference period.


What is an API?

Application Programming Interface

Software intermediary that allows two applications to talk to each other


What are hallucinations and biased and unfair models (risks of LLMs)?

Hallucinations: Incorrect information presented as true

Biased and unfair models: AI models can inherit or amplify biases present in their training data, leading to unfair or discriminatory outcomes


Are logistic regression, deep learning, and CNNs types of supervised or unsupervised learning?



What is a box plot helpful for?

Helpful for summarizing and comparing multiple distributions by showing the median, quartiles, and spread of each
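As a rough illustration, the numbers a box plot draws for each distribution can be computed with Python's standard statistics module. The two groups below are invented, and the "inclusive" quartile method is just one common convention; plotting libraries may compute quartiles slightly differently.

```python
import statistics

def five_number_summary(values):
    """Return the values a box plot typically displays for one distribution."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }

# Invented data for two groups to compare side by side.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
summary_a = five_number_summary(group_a)
summary_b = five_number_summary(group_b)
print(summary_a)
print(summary_b)
```

Putting the two summaries next to each other shows at a glance that group_b has both a higher median and a wider spread.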


What is the MDD-W indicator? How is it measured?

It is a dichotomous (yes/no) indicator of whether or not women 15-49 years of age have consumed at least 5 out of 10 defined food groups during the previous day or night
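The yes/no rule can be sketched as a simple count of distinct food groups. The group names below are an assumption based on the commonly published MDD-W list and do not come from these notes; the function name and the sample answers are also made up for the example.

```python
# Assumed names for the 10 MDD-W food groups (not taken from these notes).
MDDW_FOOD_GROUPS = {
    "grains_roots_tubers",
    "pulses",
    "nuts_seeds",
    "dairy",
    "meat_poultry_fish",
    "eggs",
    "dark_green_leafy_vegetables",
    "other_vitamin_a_rich_fruits_vegetables",
    "other_vegetables",
    "other_fruits",
}

def meets_mdd_w(consumed_groups, threshold=5):
    """Return True if at least `threshold` of the 10 groups were consumed."""
    # Count each recognized group once, ignoring anything outside the list.
    distinct = set(consumed_groups) & MDDW_FOOD_GROUPS
    return len(distinct) >= threshold

# Invented answers for two respondents about the previous day or night.
respondent_1 = ["grains_roots_tubers", "pulses", "dairy", "eggs", "other_fruits"]
respondent_2 = ["grains_roots_tubers", "other_vegetables", "grains_roots_tubers"]
result_1 = meets_mdd_w(respondent_1)
result_2 = meets_mdd_w(respondent_2)
print(result_1, result_2)
```

Because the indicator is dichotomous, the output for each woman is only True or False, not the count itself.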


What is Generative Artificial Intelligence?

AI models trained on very large volumes of data that are able to understand context and generate outputs based on a natural language input prompt


What is Geospatial data?

Data points that are geo-referenced, meaning each one is tied to a location


What is the Accuracy equation?

What is the Precision equation? 

What is the Recall equation? 

Accuracy = Correct Predictions / Total Predictions

Precision = True Positives / (True Positives + False Positives)

Recall = True Positives / (True Positives + False Negatives)
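A small worked example of the three equations, assuming a binary problem where 1 is the positive class. The actual and predicted labels are invented, and returning 0.0 when a denominator is zero is a choice made for this sketch.

```python
def classification_metrics(actual, predicted, positive=1):
    """Compute accuracy, precision, and recall for binary labels."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == positive and p == positive)
    false_pos = sum(1 for a, p in pairs if a != positive and p == positive)
    false_neg = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)

    accuracy = correct / len(pairs)
    # Assumption for this sketch: report 0.0 when a denominator would be zero.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return accuracy, precision, recall

# Invented labels: 4 real positives, of which the model finds 2,
# plus 1 false alarm among the 4 real negatives.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
accuracy, precision, recall = classification_metrics(actual, predicted)
print(accuracy, precision, recall)
```

Here accuracy is 5/8, precision is 2/3, and recall is 2/4, which shows that the three measures can differ on the same predictions.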


What are recurrent neural networks?

A specialized type of NN that has a recurrent layer, which allows information to be looped back into the network so that the network maintains a memory of previous inputs

Specifically designed to deal with inputs that have a time dependency
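The looping-back idea can be sketched with a single recurrent unit: at each time step the new hidden state depends on the current input and on the previous hidden state. The weights below are arbitrary numbers chosen for illustration; a real RNN learns them and uses vectors and matrices rather than single values.

```python
import math

def rnn_step(x, hidden, w_input, w_hidden, bias):
    """One time step: combine the current input with the previous hidden state."""
    return math.tanh(w_input * x + w_hidden * hidden + bias)

def run_rnn(sequence, w_input=0.5, w_hidden=0.8, bias=0.0):
    """Feed a sequence through the unit and return the hidden state at each step."""
    hidden = 0.0  # the memory starts empty
    states = []
    for x in sequence:
        hidden = rnn_step(x, hidden, w_input, w_hidden, bias)
        states.append(hidden)
    return states

# Invented sequence: a single signal at the first step, then nothing.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

The hidden state stays above zero at the second and third steps even though the inputs there are zero, which is the memory of the earlier input being carried forward.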


What does the HFIAS capture?

The HFIAS (Household Food Insecurity Access Scale) was designed to capture household behaviors signifying insufficient quality and quantity of food, as well as anxiety and uncertainty over household food access or supply


What are LLMs?

Large Language Models: AI models trained on large volumes of text, capable of understanding and generating text with a high level of coherence and relevance


In terms of data protection, what should you think about when considering proportionality?

Is the data being collected strictly for the stated purposes?

Are there measures in place to prevent the use of data for purposes other than those initially agreed upon?


Explain how deep learning works

Each neuron computes a linear combination of the inputs it receives from the previous layer, using weights it has learned. Once the linear combination has been computed, the neuron passes the result through an activation function. The output of the final layer is used to generate a prediction or a specific action
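A minimal sketch of that computation, assuming a sigmoid activation and made-up weights (in a trained network the weights would be learned from data). One small hidden layer feeds a single output neuron.

```python
import math

def sigmoid(z):
    """Activation function that squashes any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Linear combination of the inputs, followed by the activation function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def layer(inputs, weight_rows, biases):
    """A layer is a group of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Invented inputs and weights, for illustration only.
inputs = [1.0, 2.0]
hidden = layer(inputs, [[0.5, -0.25], [0.3, 0.2]], [0.0, -0.5])
prediction = neuron(hidden, [1.0, -1.0], 0.0)
print(hidden, prediction)
```

The outputs of the hidden layer become the inputs of the next neuron, which is what stacking layers in deep learning amounts to.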


What is the goal of classification?

To build a model that can assign an input to one of several predefined categories.

The most common classification algorithms include decision trees, logistic regression, and Naive Bayes


How is the Food Consumption Score measured?

Food items are grouped into 8 food groups, and the household reports how frequently each food group was consumed by the household in the 7 days before the survey

Each group is then assigned a weight based on its nutrient content

The weighted values are summed, and that sum is the FCS
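The calculation can be sketched as a weighted sum. The weights and group names below are an assumption based on the commonly cited WFP values and are not given in these notes, so check them against the course material; the household answers are invented.

```python
# Assumed weights per food group (commonly cited WFP values, not from these notes).
FCS_WEIGHTS = {
    "staples": 2.0,
    "pulses": 3.0,
    "vegetables": 1.0,
    "fruit": 1.0,
    "meat_fish": 4.0,
    "milk": 4.0,
    "sugar": 0.5,
    "oil": 0.5,
}

def food_consumption_score(days_consumed):
    """Sum, over the 8 groups, of days consumed in the last 7 days times the weight."""
    score = 0.0
    for group, weight in FCS_WEIGHTS.items():
        # A group cannot be counted for more than the 7 days of the recall period.
        days = min(days_consumed.get(group, 0), 7)
        score += days * weight
    return score

# Invented household: number of days each group was eaten in the past 7 days.
household = {
    "staples": 7,
    "pulses": 2,
    "vegetables": 5,
    "fruit": 1,
    "meat_fish": 1,
    "milk": 0,
    "sugar": 4,
    "oil": 6,
}
fcs = food_consumption_score(household)
print(fcs)
```

For this household the sum is 14 + 6 + 5 + 1 + 4 + 0 + 2 + 3 = 35.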
