This type of graph, where we compare two quantitative variables, allows us to do regression analysis
Scatterplot
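As a rough illustration of the regression analysis a scatterplot makes possible, the sketch below fits a least-squares line of best fit to a handful of points. The data values and the helper name `fit_line` are made up for illustration and do not come from the course materials.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data that lies exactly on the line y = 2x + 1
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With real data the points would scatter around the fitted line rather than sit on it, and the distances from the line (the residuals) are what help flag unusual points.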
When we take a small subset of our data to perform inferential statistics, we are looking at this
Sample
The way humans interact with Generative AI is called this
Prompt
When telling a story with data, you need to have these things
Data, Narrative, and Visualizations
The most important findings of your analysis should be explained in this portion of the narrative
Climax
Correlation
This is the procedure we perform when we are looking to see if our sample is different from the population
Hypothesis Test
A finite sequence of well-defined instructions used to solve a computational problem is known as this
Algorithm
The "So-What" and thesis portions of a data story together make up this element of the story
Main Point
The "So-What" statement and thesis of the narrative should be in this portion of the narrative
Initiating Event
When we examine points on a scatterplot that are far away from the line of best fit, what type of points are we studying?
Influential Points
This quantity is what we use to determine statistical significance
P-value
A type of algorithm that mainly looks for natural patterns in the data, as opposed to fitting a model, is known as this
Unsupervised Learning
This phenomenon is experienced when two things happen to be correlated with each other by chance despite being unrelated to each other
Spurious Correlation
We make final recommendations to stakeholders in this portion of the narrative
Resolution / Conclusion
Mike is studying the relationship between high temperatures and turkey sales. He calculated the correlation as -0.98. Mike concludes that lower temperatures cause turkey sales to increase. Is he correct? Why or why not?
No, he is incorrect because correlation does not imply causation
We compare the p-value to this quantity when we determine statistical significance
Significance level
Rich is creating an AI model that aims to predict with a high degree of accuracy the time it takes to drive to San Diego on any given day. What type of model should he use?
Neural Network
Psychology of Data
We perform descriptive analytics in this section of the narrative
Exposition
Stephanie is the leader of her local rocketry club. She measured the position and acceleration of her rocket at launch and calculated the correlation to be 0. What can she conclude?
Position and acceleration have no linear relationship. However, they can still be related in a nonlinear way
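A correlation of 0 only rules out a linear relationship. The sketch below is a generic illustration, not Stephanie's rocket data: the values are hypothetical, and `pearson_r` is a helper written here for the example. It shows two variables that are perfectly related by a curve yet have a Pearson correlation of 0.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: y is completely determined by x (y = x squared),
# but the relationship is a symmetric curve, not a line.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
r = pearson_r(xs, ys)
print(r)
```

Because the positive and negative halves of the curve cancel out, the linear measure sees nothing, which is why a scatterplot is worth checking before concluding that two variables are unrelated.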
Kermit wanted to compare whether a new design of a lug nut for his car tires is significantly better than the conventional design. He obtained a p-value of 0.44. Assuming the significance level is 0.05, what should he conclude?
Fail to reject the null hypothesis. There is not enough evidence to say the new lug nut is better than the conventional one
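The decision rule behind this answer can be sketched in a few lines. The function name `decide` is hypothetical, and the sketch assumes the common convention of rejecting the null hypothesis when the p-value is less than or equal to the significance level (some texts use a strict less-than instead).

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value to the significance level and return the decision."""
    # Assumed convention: reject when p_value <= alpha
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Values from the lug nut question: p-value 0.44, significance level 0.05
decision = decide(0.44, alpha=0.05)
print(decision)
```

Note that failing to reject the null hypothesis is not the same as showing the two designs are equal; it only means the data did not provide enough evidence of a difference.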
Think back to our case study. Why did we pick the linear regression model out of all the others?
We wanted high interpretability (we can see the specific parameters). Also, we are predicting a value rather than classifying, which eliminates logistic regression
Neil performed a study on whether caffeine improves how someone drives a car. He only chose Mountain Dew drinkers in his study to compare with non-caffeine consumers. What kind of bias is his study most likely experiencing?
Selection Bias
One of the functions of this section, among others, is to build the scaffolding
Rising Action