DATA LITERACY DAY 2 Jeopardy Template

REG. ANALYSIS

INF. STATISTICS

PRED. ANALYTICS

PRESC. ANALYTICS

DATA NARRATIVE

100

This type of graph, which compares two quantitative variables, is used in regression analysis

Scatterplot

100

A subset of data from a larger dataset that is used for inferential statistics is known as this

Sample

100

The input a human gives to Generative AI in order to interact with it is called this

Prompt

100

When telling a story with data, you need to have these things

Data, Narrative, and Visualizations

100

The most important findings of your analysis should be explained in this portion of the narrative

Climax

200

This value, denoted by the letter r, measures the strength and direction of a linear relationship

Correlation Coefficient

200

This type of test allows us to see if our sample is significantly different from the population

Hypothesis Test

200

A finite sequence of well-defined instructions used to solve a computational problem is known as this

Algorithm

200

The "So-What" and thesis portions of a data story together make up this element of the story

Main Point

200

The "So-What" statement and thesis of the narrative should be in this portion of the narrative

Initiating Event

300

Mike is an economist who collected data on the age and net worth of a group of people. He calculated the correlation as 0.63 and concluded that a higher net worth is directly caused by increased age. Is he correct? Why or why not?

No, he is incorrect because correlation does not imply causation

300

This quantity is compared with the significance level to determine statistical significance

P-value

300

This type of algorithm mainly looks for natural patterns in the data, as opposed to fitting a model

Unsupervised Learning

300

This phenomenon is experienced when two things happen to be correlated with each other by chance despite being unrelated to each other

Spurious Correlation

300

We perform descriptive analytics in this section of the narrative

Exposition

400

Stephanie, a Formula-1 superfan, collected data on how quickly each driver accelerated from their starting position. She graphed the distance from start versus the time for each driver on a scatterplot and calculated the correlation to be 0. What can she conclude?

Distance and time have no linear relationship
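A correlation of 0 rules out only a linear relationship; the two variables can still be related in some other way. The sketch below illustrates this with made-up numbers (they are not Stephanie's data), using a hypothetical helper named pearson_r written just for this example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator of r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (denominator of r).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Illustrative data only: y is fully determined by x, but not linearly.
xs = [-2, -1, 0, 1, 2]
ys_curved = [x ** 2 for x in xs]
r_curved = pearson_r(xs, ys_curved)

# For contrast, a perfectly linear relationship on the same x values.
ys_linear = [3 * x + 1 for x in xs]
r_linear = pearson_r(xs, ys_linear)

print(f"curved: r = {r_curved:.2f}")
print(f"linear: r = {r_linear:.2f}")
```

In the curved case y depends entirely on x, yet r comes out as 0, which is why the answer says "no linear relationship" rather than "no relationship".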

400

Eveline is a doctor who recently read about a study on a new cancer medication that reported a p-value of 0.23 when compared to the existing cancer drug. If the significance level is 0.05, what should she conclude?

Fail to reject the null hypothesis. There is not enough evidence to say the new cancer drug is better than the existing one
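The decision rule behind this answer can be sketched in a few lines. The function name decide and its return strings are assumptions made for illustration, and the sketch uses the common convention of rejecting when the p-value is less than or equal to the significance level.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value <= alpha:
        # The result is statistically significant at level alpha.
        return "reject the null hypothesis"
    # Not enough evidence against the null hypothesis.
    return "fail to reject the null hypothesis"


# Values from the clue: p-value 0.23, significance level 0.05.
decision = decide(0.23, alpha=0.05)
print(decision)
```

Since 0.23 is greater than 0.05, the sketch returns "fail to reject the null hypothesis". Failing to reject is not the same as showing the two drugs are equally effective.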

400

Rich is deciding between creating either a random forest or linear regression model to predict next year's sales for his business. If he wants to know the impact of specific factors on sales, which model should he use and why?

Linear regression model, because it is easier to interpret: its coefficients estimate the impact of each factor on sales

400

Neil is writing an article in his hometown newspaper about the recent closure of a large factory. Since he is unhappy about the closure, he only cited sources that agreed with his viewpoint in the article. What type of bias does this article have?

Confirmation Bias

400

One of the functions of this section, among others, is to build the scaffolding

Rising Action