What is a crosstab chart?
It counts how many times combinations of values appear. Arrows show where that row in the data table would be counted in the chart.
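The counting behind a crosstab can be sketched in a few lines of Python. This is only an illustration: the survey rows and the column names (favorite pet, grade level) are made up for the example, not taken from any real data set.

```python
from collections import Counter

# Hypothetical data table: each row is (favorite_pet, grade_level).
rows = [
    ("dog", "9th"),
    ("cat", "9th"),
    ("dog", "10th"),
    ("dog", "9th"),
    ("cat", "10th"),
    ("dog", "10th"),
]

# Count how many times each combination of values appears.
counts = Counter(rows)

# The distinct values of each column become the chart's rows and columns.
pets = sorted({pet for pet, _ in rows})
grades = sorted({grade for _, grade in rows})

# crosstab[pet][grade] is the number of rows with that combination.
crosstab = {pet: {grade: counts[(pet, grade)] for grade in grades} for pet in pets}

# Print the crosstab as a small text table.
print("pet".ljust(6) + "".join(grade.ljust(6) for grade in grades))
for pet in pets:
    print(pet.ljust(6) + "".join(str(crosstab[pet][g]).ljust(6) for g in grades))
```

Each row of the data table adds 1 to exactly one cell, so the cells always add up to the number of rows.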
What is a scatter chart?
Shows combinations of values from two columns
What is open data?
sharing data with others so they can analyze it
Open data is publicly available data shared by governments, organizations, and others
Making data open helps spread useful knowledge and creates opportunities for others to use it to solve problems
What is big data?
Collecting huge amounts of data so we can learn even more from it
What is correlation?
a relationship between two pieces of data, typically referring to the amount that one varies in relation to the other
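One common way to put a number on correlation is the Pearson correlation coefficient, which ranges from -1 to 1. The flashcard does not name a specific measure, so treat this as one possible sketch. The `pearson` helper and the study-hours data are invented for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two columns vary together.
    together = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each column varies on its own.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return together / (spread_x * spread_y)

# Hypothetical data: hours studied and quiz score for five students.
hours_studied = [1, 2, 3, 4, 5]
quiz_score = [52, 60, 71, 80, 88]

r = pearson(hours_studied, quiz_score)
print(round(r, 3))
```

A value near 1 means the two columns rise together, a value near -1 means one falls as the other rises, and a value near 0 means there is little linear relationship. A strong correlation does not by itself show that one thing causes the other.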
How are crosstab charts useful?
Finding the most / least common combinations of values in two columns
Finding patterns across two columns
Exploring two columns when one or both are strings.
How are scatter charts useful?
Seeing patterns and trends between two values
Numeric data with lots of different values
What is crowdsourcing?
Crowdsourcing is the practice of obtaining input or information from a large number of people via the Internet. Crowdsourcing offers new models for collaboration, such as connecting businesses or social causes with funding. Citizen science and crowdsourcing are both examples of how human capabilities can be enhanced by collaboration via computing.
What is data bias?
data that does not accurately reflect the full population or phenomenon being studied
What is citizen science?
Citizen science is research where some of the data collection is done by members of the public using their own computing devices, which helps solve scientific problems.
When are crosstab charts not useful?
If either column has too many values (the chart would be enormous)
When are scatter charts not useful?
Lots of repeated values
What is data filtering?
choosing a smaller subset of a data set to use for analysis
ex: by eliminating / keeping only certain rows in a table
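Filtering rows can be sketched like this in Python. The table, the column names, and the 90-degree cutoff are all assumptions made up for the example.

```python
# Hypothetical table: one dictionary per row.
rows = [
    {"city": "Austin", "temp_f": 95},
    {"city": "Boston", "temp_f": 72},
    {"city": "Phoenix", "temp_f": 104},
    {"city": "Seattle", "temp_f": 68},
]

# Keep only the rows that match the condition; the original table is unchanged.
hot_rows = [row for row in rows if row["temp_f"] >= 90]

for row in hot_rows:
    print(row["city"], row["temp_f"])
```

The filtered subset is a new, smaller list, so the full data set is still available for other questions.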
What did we learn about machine learning?
-artificial intelligence
-the extraction of knowledge from data based on algorithms created from training data
What is this?
Crosstab
What is this?
Scatter
What is this?
When does data need to be cleaned?
When data is incomplete or invalid, or when multiple tables are combined into one
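A small sketch of cleaning, assuming two made-up tables that are being combined into one. The `clean` helper and its rules (trim spaces, fix capitalization, drop rows whose age is blank or not a whole number) are illustrative choices, not a standard procedure.

```python
# Hypothetical tables collected by two different people.
table_a = [
    {"name": "Ana", "age": "17"},
    {"name": "Ben", "age": ""},         # incomplete: age is missing
]
table_b = [
    {"name": "cy ", "age": "sixteen"},  # invalid: age is not a number
    {"name": " dee", "age": "18"},      # inconsistent spacing and capitalization
]

def clean(rows):
    """Return only usable rows, with names tidied and ages stored as integers."""
    cleaned = []
    for row in rows:
        name = row["name"].strip().title()
        age_text = row["age"].strip()
        if not age_text.isdigit():
            continue  # skip rows with a blank or non-numeric age
        cleaned.append({"name": name, "age": int(age_text)})
    return cleaned

# Combine the tables first, then clean the result.
combined = clean(table_a + table_b)
print(combined)
```

Dropping bad rows is only one option. Depending on the analysis, it may be better to correct a value by hand or keep the row and mark the value as unknown.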
A town decides to publicize data it has collected about electricity usage around the city. The data is freely available for all to use and analyze in the hopes that it is possible to identify more efficient energy usage strategies.
Which of the following does this situation best demonstrate?
Open data
Which graphs are only useful for looking at one column of data?
Bar charts and histograms
What is metadata?
data about data