DATA LITERACY DAY 2

REG. ANALYSIS

INF. STATISTICS

PRED. ANALYTICS

PRESC. ANALYTICS

DATA NARRATIVE

100

This type of graph, where we compare two quantitative variables, allows us to do regression analysis

Scatterplot

100

When we take a small subset of our data to perform inferential statistics, we are looking at this

Sample

100

The way humans interact with Generative AI is called this

Prompt

100

When telling a story with data, you need to have these things

Data, Narrative, and Visualizations

100

The most important findings of your analysis should be explained in this portion of the narrative

Climax

200

A quantity that measures the strength and direction of a linear relationship between two variables, denoted by the letter r, is known as this

Correlation
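For anyone who wants to see the calculation behind the answer, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The function name `pearson_r` and the hours/scores numbers are made up for illustration; they are not from the course case study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and the two sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data: hours studied vs. quiz score
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 71, 78, 90]
r = pearson_r(hours, scores)
print(f"r = {r:.3f}")  # a value close to +1 means a strong positive linear relationship
```

Values of r near +1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak or absent one.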

200

This is the procedure we perform when we are looking to see if our sample is different from the population

Hypothesis Test

200

A finite sequence of well-defined instructions used to solve a computational problem is known as this

Algorithm

200

The "So-What" and thesis portions of a data story both comprise this element of the story

Main Point

200

The "So-What" statement and thesis of the narrative should be in this portion of the narrative

Initiating Event

300

When we examine points on a scatterplot that are far away from the line of best fit, what type of points are we studying?

Influential Points

300

This quantity is what we use to determine statistical significance

P-value

300

A type of algorithm that mainly looks for natural patterns in data, as opposed to fitting a model, is known as this

Unsupervised Learning

300

This phenomenon is experienced when two things happen to be correlated with each other by chance despite being unrelated to each other

Spurious Correlation

300

We make final recommendations to stakeholders in this portion of the narrative

Resolution / Conclusion

400

Mike is studying the relationship between high temperatures and turkey sales. He calculated the correlation as -0.98. Mike concludes that lower temperatures cause turkey sales to increase. Is he correct? Why or why not?

No, he is incorrect because correlation does not imply causation

400

We compare the p-value to this quantity when we determine statistical significance

Significance level

400

Rich is creating an AI model that aims to predict with a high degree of accuracy the time it takes to drive to San Diego on any given day. What type of model should he use?

Neural Network

400

When we try to understand how the audience perceives our explanations, we are studying this

Psychology of Data

400

We perform descriptive analytics in this section of the narrative

Exposition

500

Stephanie is the leader of her local rocketry club. She measured the position and acceleration of her rocket at launch and calculated the correlation to be 0. What can she conclude?

Position and acceleration have no linear relationship. However, they can still be related in a nonlinear way
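A quick way to convince yourself that a correlation of 0 does not rule out a relationship is the sketch below. The data are an assumed toy example (y is exactly x squared on a symmetric range), not Stephanie's rocket measurements, and `pearson_r` is a hypothetical helper written for this illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient (hypothetical helper for this example)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Toy data: y depends on x perfectly, but through a curve, not a line
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
# The contributions from negative and positive x values cancel out,
# so r comes out as 0 even though y is fully determined by x.
print(f"r = {r:.3f}")
```

The takeaway is that r only measures linear association, so a scatterplot is still worth checking before concluding two variables are unrelated.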

500

Kermit wanted to test whether a new design of a lug nut for his car tires is significantly better than the conventional design. He obtained a p-value of 0.44. Assuming the significance level is 0.05, what should he conclude?

Fail to reject the null hypothesis. There is not enough evidence to say the new lug nut is better than the conventional one
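The decision rule used in this answer can be sketched in a few lines of Python. The function name `decide` is hypothetical, and the sketch assumes the common convention of rejecting when the p-value is strictly less than the significance level (some texts use "less than or equal to").

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value to the significance level and return the decision."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    # Assumed convention: reject only when p_value is strictly below alpha
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

# Kermit's numbers from the clue: p-value 0.44, significance level 0.05
print(decide(0.44, alpha=0.05))
```

Note that "fail to reject" is not the same as proving the two designs are equal; it only says the data did not provide enough evidence of a difference.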

500

Think back to our case study. Why did we pick the linear regression model out of all the others?

We wanted high interpretability (we can see the specific parameters). Also, we are predicting a numeric value rather than classifying, which eliminates logistic regression

500

Neil performed a study on whether caffeine improves how someone drives a car. He only chose Mountain Dew drinkers in his study to compare with non-caffeine consumers. What kind of bias is his study most likely experiencing?

Selection Bias

500

One of the functions of this section, among others, is to build the scaffolding

Rising Action