Define Data Bias
Data that does not accurately reflect the full population or phenomenon being studied
True or false: A research lab decides that it would take too many resources to perform its own research. Crowdsourcing is a good option here. Explain your answer.
True. Crowdsourcing gives the lab an easy way to get the data it is looking for without needing to perform its own experiments.
Name a type of chart or graph for data analysis
Crosstab, Scatter Plot, Bar Chart, Histogram
What are ethics?
Moral principles that govern a person's behavior or the conducting of an activity
Define Metadata
Data about the data
Columns could be named differently and not give you an accurate reading.
Why does someone filter data?
To better focus on a specific part of a data set.
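As a rough illustration of filtering, the sketch below keeps only the rows that match a condition. The records and the field names (`species`, `beak_mm`) are made up for this example.

```python
# Hypothetical bird observations; field names and values are invented for illustration.
observations = [
    {"species": "crow", "beak_mm": 52},
    {"species": "sparrow", "beak_mm": 11},
    {"species": "crow", "beak_mm": 48},
    {"species": "robin", "beak_mm": 15},
]

def filter_rows(rows, species):
    """Return only the rows whose 'species' field matches the given value."""
    return [row for row in rows if row["species"] == species]

crows = filter_rows(observations, "crow")
print(len(crows))
```

The original list is left unchanged, so the full data set is still available if a different filter is needed later.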
Give one example of data bias
One example of data bias could be a facial recognition system not being able to recognize people with a darker skin tone.
Define Open Data
Data that is freely available to anyone
You are researching the number of crows in your area and their specific beak size. Which chart would be a better fit to use: Bar Chart or Histogram?
Histogram
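A histogram fits here because beak size is a measurement that can be grouped into ranges (bins). A minimal sketch of that grouping, assuming invented beak lengths and a hypothetical helper named `histogram_counts`:

```python
from collections import Counter

# Hypothetical beak lengths in millimetres (invented values).
beak_mm = [47, 52, 49, 55, 51, 48, 53, 58, 50, 46]

def histogram_counts(values, bin_width):
    """Count how many values fall into each bin of the given width."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

bins = histogram_counts(beak_mm, 5)
for start, count in bins.items():
    # Draw one '#' per measurement in the bin as a simple text histogram.
    print(f"{start}-{start + 4} mm: {'#' * count}")
```

A bar chart would instead compare separate categories, such as crow counts per neighborhood.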
What does a crosstab chart show?
It shows how many times a certain combination of values occurred.
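Counting combinations of values can be sketched in a few lines. The survey responses below are invented, and `crosstab` is a hypothetical helper written for this example, not a library function.

```python
from collections import Counter

# Hypothetical survey responses as (grade, favorite_subject) pairs; invented data.
responses = [
    ("9th", "math"), ("9th", "art"), ("10th", "math"),
    ("9th", "math"), ("10th", "art"), ("10th", "math"),
]

def crosstab(pairs):
    """Count how many times each (row value, column value) combination occurs."""
    return Counter(pairs)

table = crosstab(responses)
rows = sorted({r for r, _ in responses})
cols = sorted({c for _, c in responses})

# Print one line per row value, with a count for each column value.
print("grade", *cols)
for r in rows:
    print(r, *(table[(r, c)] for c in cols))
```

Each cell in the printed table is the number of responses that share that grade and that subject.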
Name one way you can avoid data bias
Collect more diverse data, review the found data with others, etc.
Define Big Data
Extremely large data sets, often too large or complex to process with traditional tools, that can be analyzed to find patterns.
A company that is developing an AI decides to test it on a small group of people in their local area. Why is this AI likely to be biased as a result of this test?
The AI will be tailored only to people in that local area, so it may not work as well for people outside that group.
What is the data analysis process?
Choose or Collect Data -> Clean and/or Filter -> Visualize and Find Patterns -> New Information
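The four steps above can be sketched end to end on a tiny data set. The crow counts and area names are invented, and a simple average per area stands in for the visualization step.

```python
from statistics import mean

# Step 1: choose or collect data (hypothetical readings; None marks a missing value).
raw = [
    {"area": "park", "crows": 12},
    {"area": "park", "crows": None},
    {"area": "downtown", "crows": 4},
    {"area": "park", "crows": 9},
    {"area": "downtown", "crows": 6},
]

# Step 2: clean and/or filter (drop rows with missing counts).
clean = [row for row in raw if row["crows"] is not None]

# Step 3: look for patterns (average count per area; a chart would play the same role).
by_area = {}
for row in clean:
    by_area.setdefault(row["area"], []).append(row["crows"])
averages = {area: mean(counts) for area, counts in by_area.items()}

# Step 4: new information (which area has the highest average count).
busiest = max(averages, key=averages.get)
print(busiest, averages[busiest])
```

Skipping the cleaning step would make the average fail on the missing value, which is one reason cleaning comes before analysis.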
A company decides to train their AI model on a data set for the most used words in English. This dataset only contains data for English speakers in the UK. Is this biased?
Yes. It is likely to work better when presented with British English.
Define Machine Learning
A branch of AI that focuses on using algorithms and data to imitate the way humans learn, gradually improving its accuracy.
A teacher buys a single-use program and installs it on all of the school computers. Is this unethical? Why or why not?
This is unethical because the teacher is installing a single-use copy on more computers than the license permits.
What is Crowdsourced data?
Data that has been collected from a large number of people, usually via the Internet.
True or False? An A.I. being biased towards a certain group of people is its own fault.
False. Even if it is unintentional, an A.I.'s bias towards certain groups comes from how its developers built and trained it, so the developers are responsible for its behavior.