Definitions
Scenarios
Data
Ethics / Bias
100

Define Data Bias

Data that does not accurately reflect the full population or phenomenon being studied

100

True or false: A research lab decides that it would take too many resources to perform its own research. Crowdsourcing is a good option here. Explain your answer.

True. Crowdsourcing gives the lab an easy way to get the data it is looking for without needing to perform its own experiments.

100

Name a type of chart or graph for data analysis

Crosstab, Scatter Plot, Bar Chart, Histogram

100

What are ethics?

Moral principles that govern a person's behavior or the conducting of an activity

200

Define Metadata

Data about the data

200

A company collects data from a survey but never filters it. What is a likely problem that could arise from this decision?

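Inconsistent or irrelevant entries stay in the data set (for example, columns that are named differently for the same thing), so the results may not give you an accurate reading.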

200

Why does someone filter data?

To better focus on a specific part of a data set.
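
As a rough illustration of filtering, the sketch below keeps only the rows of a small data set that match a condition. The survey rows, the field names, and the `filter_rows` helper are all made up for this example.

```python
# Hypothetical survey rows; the names and fields are invented for illustration.
responses = [
    {"name": "Ana", "grade": 9, "hours_online": 3.5},
    {"name": "Ben", "grade": 10, "hours_online": 5.0},
    {"name": "Cam", "grade": 9, "hours_online": 2.0},
    {"name": "Dee", "grade": 11, "hours_online": 4.5},
]


def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


# Focus on one part of the data set: ninth graders only.
ninth_graders = filter_rows(responses, lambda row: row["grade"] == 9)

for row in ninth_graders:
    print(row["name"], row["hours_online"])
```

The original list is left unchanged, so other filters can be applied to the same data later.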

200

Give one example of data bias

One example of data bias could be a facial recognition system not being able to recognize people with darker skin tones.

300

Define Open Data

Data that is freely available to anyone

300

You are researching the number of crows in your area and their specific beak size. Which chart would be a better fit to use: Bar Chart or Histogram?

Histogram
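
A histogram fits here because beak size is a measurement, so values are grouped into ranges (bins) and counted. The sketch below shows one possible way to do that binning by hand. The beak lengths, the 5 mm bin width, and the `histogram` helper are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical beak lengths in millimetres, invented for this example.
beak_lengths_mm = [48.2, 51.0, 53.7, 49.9, 55.1, 52.4, 50.3, 57.8]


def histogram(values, bin_width):
    """Count how many values fall into each bin of the given width.

    Each bin is labelled by its lower edge, e.g. 50 covers 50 up to (not including) 55.
    """
    counts = Counter()
    for value in values:
        bin_start = int(value // bin_width) * bin_width
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


bins = histogram(beak_lengths_mm, bin_width=5)

# Print a simple text version of the chart, one row per bin.
for start, count in bins.items():
    print(f"{start}-{start + 5} mm: {'#' * count}")
```

A bar chart would instead suit separate categories, such as the number of crows seen in each neighborhood.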

300

What does a crosstab chart show?

It shows how many times a certain combination of values occurred.
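
To make the idea of counting combinations concrete, here is a minimal sketch that builds a crosstab from pairs of values. The sightings data and the `crosstab` helper are hypothetical and exist only for this example.

```python
from collections import Counter

# Hypothetical sightings as (species, location) pairs, invented for illustration.
sightings = [
    ("crow", "park"),
    ("crow", "park"),
    ("sparrow", "park"),
    ("crow", "street"),
    ("sparrow", "street"),
    ("sparrow", "street"),
]


def crosstab(pairs):
    """Count how many times each (row value, column value) combination occurs."""
    return Counter(pairs)


table = crosstab(sightings)
species = sorted({s for s, _ in sightings})
places = sorted({p for _, p in sightings})

# Print the counts as a small grid: one row per species, one column per place.
print("species".ljust(10) + "".join(p.ljust(8) for p in places))
for s in species:
    print(s.ljust(10) + "".join(str(table[(s, p)]).ljust(8) for p in places))
```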

300

Name one way you can avoid data bias

Collect more diverse data, review the collected data with others, etc.
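
One simple way to check whether data is diverse enough is to look at what share of it comes from each group. The sketch below is only an illustration: the sample, the `group_shares` helper, and the 20% cutoff are assumptions, not an established rule.

```python
from collections import Counter


def group_shares(labels):
    """Return the fraction of the data set that belongs to each group."""
    counts = Counter(labels)
    total = len(labels)
    return {group: count / total for group, count in counts.items()}


# Hypothetical sample: where each record in a data set came from.
sample = ["UK"] * 8 + ["US"] * 1 + ["India"] * 1

shares = group_shares(sample)

# Assumed cutoff for this example: flag any group under 20% of the data.
underrepresented = sorted(g for g, share in shares.items() if share < 0.2)

print(shares)
print("Collect more data for:", underrepresented)
```

A check like this only points out gaps. Deciding which groups matter and how much data is enough still takes human review.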

400

Define Big Data

A data set so large or complex that it is difficult to store and process with traditional tools.

400

A company that is developing an AI decides to test it on a small group of people in their local area. Why is this AI likely to be biased as a result of this test?

The AI will be tailored only to the people in that local area, who may not represent everyone else who will use it.

400

What is the data analysis process?

Choose or Collect Data -> Clean and/or Filter -> Visualize and Find Patterns -> New Information
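
The four steps above can be sketched as a small pipeline. Everything here is assumed for illustration: the survey rows, the function names, and the choice of an average as the "pattern".

```python
from statistics import mean


def collect():
    """Step 1: choose or collect data (hypothetical survey rows)."""
    return [
        {"grade": 9, "hours": 3.0},
        {"grade": 9, "hours": None},   # missing answer
        {"grade": 10, "hours": 5.0},
        {"grade": 10, "hours": -1.0},  # impossible value
        {"grade": 9, "hours": 4.0},
        {"grade": 10, "hours": 6.0},
    ]


def clean(rows):
    """Step 2: clean and/or filter by dropping missing or impossible values."""
    return [r for r in rows if r["hours"] is not None and r["hours"] >= 0]


def find_pattern(rows):
    """Step 3: look for a pattern, here the average hours per grade."""
    by_grade = {}
    for r in rows:
        by_grade.setdefault(r["grade"], []).append(r["hours"])
    return {grade: mean(hours) for grade, hours in by_grade.items()}


# Step 4: the result is new information that was not obvious from the raw rows.
averages = find_pattern(clean(collect()))
for grade, avg in sorted(averages.items()):
    print(f"Grade {grade}: {avg} hours on average")
```

In practice the third step often includes a chart, which is left out here to keep the sketch short.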

400

A company decides to train their AI model on a data set for the most used words in English. This data set only contains data for English speakers in the UK. Is this biased?

Yes. It is likely to work better when presented with British English.

500

Define Machine Learning

A branch of AI that focuses on the use of algorithms and data to imitate the way that humans learn, gradually improving its accuracy.

500

A teacher buys a single-use program and installs it on all of the school computers. Is this unethical? Why or why not?

This is unethical because the teacher is installing a copy meant for a single use on more computers than the license permits.

500

What is crowdsourced data?

Data that has been collected from a large number of people, usually via the Internet.

500

True or False? An A.I. being biased towards a certain group of people is its own fault.

False. Even if it is unintentional, an A.I.'s bias towards certain groups is taught by its developers; therefore, the developers are held responsible for its behavior.