Linear Regression
Observational Studies
Experiments
Halloween
Random
100

This is the variable that represents the predicted y value 

What is y-hat (ŷ)

100

Your friend says that since people seem to be carrying umbrellas every time it is raining, then there is statistical evidence that umbrellas cause rain. This is what you would say to your friend

What is no. Correlation does not imply causation

100

I am testing if reading a bedtime story every night to plants makes them grow faster. I read to 10 plants and don't read to another 10 plants. This is the treatment in the experiment

What is reading to the plants every night?

100

Out of the following, this is the fake cereal:

-- Count Chocula

-- Skelly Crunch 

-- Boo Berry

-- Yummy Mummy

What is Skelly Crunch

100

Pick a number 1-4

Congratulations! You have stolen 200 points from that team (if this was your own team, your score does not change).

200

This is what LSRL stands for

What is least squares regression line

200

This is the name of a hidden variable that could actually be responsible for the correlation between two other variables

What is a confounding variable

200

I am testing a new medication to treat the condition where people think cilantro tastes bad. Experimenters give test subjects (a group of people who have this condition) one of two pills-- the cilantro cure or a placebo. Experimenters then ask the test subjects to eat cilantro and rate how bad it tastes on a scale of 1-5. This is what I would have to do to make sure this experiment is double blind

What is making sure that neither the experimenters nor the test subjects (experimental units) know who got the real cilantro pill and who got the placebo

200

Transylvania is a region in this country

What is Romania

200

Each team nominates one person

Clap when I clap. Whoever is left at the end earns their team 500 points.

300

Interpret the following: the residual for Dracula's test score was -12 points.

What is Dracula's test score was 12 points lower than predicted by the LSRL
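
The sign convention behind this response (residual = observed minus predicted) can be checked with a short Python sketch. The two scores are made-up numbers for illustration only, and `residual` is a hypothetical helper name.

```python
def residual(observed, predicted):
    """Residual = observed y minus predicted y (y - y_hat)."""
    return observed - predicted

# Hypothetical numbers: the LSRL predicted 90, Dracula actually scored 78.
dracula_residual = residual(observed=78, predicted=90)
print(dracula_residual)  # -12
```

A negative residual means the observed point sits below the regression line.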

300

I want to estimate the average height of a Webb student. Luckily, everyone is lined up for a group picture in ten rows. I sample by randomly choosing two rows and measuring the heights of every student in those rows. Explain the bias (or lack thereof) in this sampling method.

What is there is no bias... it is imprecise, but overestimating and underestimating are equally likely. Repeating this study enough times will yield an accurate result. This is a cluster sample.
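
One way to see "imprecise but unbiased" is to list every possible pair of rows and average the estimates. This is only a sketch: the heights are invented, and it assumes all ten rows hold the same number of students.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heights (inches): 10 rows of 4 students, shortest row in front.
rows = [[58 + r + s for s in range(4)] for r in range(10)]

population_mean = mean(h for row in rows for h in row)

# Every possible cluster sample: pick 2 of the 10 rows, measure everyone in them.
sample_means = [mean(a + b) for a, b in combinations(rows, 2)]

print(population_mean)                       # true average height
print(mean(sample_means))                    # average estimate over all 45 samples
print(min(sample_means), max(sample_means))  # single samples can miss by a lot
```

Individual samples land well above or below the true mean, but the estimates average out to it, which is what "no bias" means here.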

300

This is the name of a block design where the blocks are of size two.

What is matched pairs?

300

Daily Double! You can risk up to all of your points or 1000 (whichever is greater)

This is Shaggy's last name (from Scooby Doo)

300

Pick a number 1-100

If you picked an even number, your score goes down by 300 points. If you picked an odd number, your score goes up by 300 points

400

Daily Double! You can risk up to all of your points or 1000 (whichever is greater)

In terms of training and testing data, explain the term overfitting in machine learning

400

The math department wants to see how stressed the typical math student at Webb is about their coursework. However, they want to make sure all math classes are represented, so they randomly select 5 students from each math class-- making the study an example of this sampling method.

What is stratified random sampling
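
A minimal Python sketch of this setup, assuming each math class is a stratum and 5 students are drawn at random from each one. The class names, roster sizes, and the `stratified_sample` helper are hypothetical.

```python
import random

def stratified_sample(classes, per_class, seed=None):
    """Draw a simple random sample of `per_class` students from every class."""
    rng = random.Random(seed)
    return {name: rng.sample(roster, per_class) for name, roster in classes.items()}

# Hypothetical rosters, one list of student IDs per math class (stratum).
classes = {
    "Algebra 2": [f"A{i}" for i in range(18)],
    "Precalculus": [f"P{i}" for i in range(15)],
    "Statistics": [f"S{i}" for i in range(12)],
}

picked = stratified_sample(classes, per_class=5, seed=31)
for name, students in picked.items():
    print(name, students)
```

Every class is guaranteed to appear in the sample, which a single simple random sample of 15 students would not ensure.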

400

Criss Angel says he is able to read your mind and guess what card you are thinking of. We run an experiment where you think of a card and he guesses, and repeat 100 times. While he didn't get all of them correct, we end up with a result that is "statistically significant." In this context, what does "statistically significant" mean?

What is...

it is unlikely that Criss Angel would have guessed your card correctly the number of times that he did purely by chance
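
"Unlikely by chance" can be put into numbers with a binomial tail probability. The sketch below rests on assumptions that are not in the clue: a standard 52-card deck (so a blind guess is right 1 time in 52) and a hypothetical result of 8 correct guesses out of 100.

```python
from math import comb

def p_value_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more correct by luck."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumed for illustration: 52-card deck, 8 correct guesses in 100 trials.
p_value = p_value_at_least(k=8, n=100, p=1 / 52)
print(f"{p_value:.6f}")
print(p_value < 0.05)  # compared with a conventional 5% cutoff
```

A small tail probability says the result would rarely happen by guessing alone. It does not say how he did it.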

400

This is the difference between a ghost and a ghoul

What is... the main difference is that ghosts are non-physical apparitions, but ghouls have a physical body. A ghoul is a reanimated corpse similar to a zombie, while a ghost is purely the spirit of the dead.

400

This is what random.org uses to generate its random numbers

What is atmospheric noise (radio static also accepted)

500

The LSRL on a scatterplot of number of pumpkin seeds vs. pumpkin volume has an r value of 0.68. Find and interpret r^2.

What is 46.24% of the variation in number of pumpkin seeds can be explained/predicted by its linear relationship with pumpkin volume
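
The arithmetic behind the percentage, as a quick Python check (variable names are arbitrary):

```python
r = 0.68            # correlation coefficient from the clue
r_squared = r ** 2  # coefficient of determination

print(f"{r_squared:.2%}")  # 46.24%
```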

500

This is a situation where a systematic random sample might be used, and this is how it would be set up.

Answers vary... any situation where things are naturally "lined up" in some way. Example:

What is exit polling-- ask every tenth person leaving the voting area who they voted for
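
The exit-polling example could be set up like this in Python. It is a sketch under assumptions: 200 voters numbered in the order they leave, a random starting point within the first ten, and a hypothetical helper named `systematic_sample`.

```python
import random

def systematic_sample(items, k, seed=None):
    """Pick a random start among the first k items, then take every k-th item."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return items[start::k]

# Hypothetical: voters numbered 1-200 in the order they leave the voting area.
voters = list(range(1, 201))
polled = systematic_sample(voters, k=10, seed=31)
print(polled)
```

The random start is what makes this a systematic random sample rather than simply "every tenth person starting with the first".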

500

I took a random sample of Webb boarding students and measured how often they go to breakfast. I also looked at their grades in SIS. I plotted their grades against their breakfast frequency and saw a strong positive relationship between the two. This is the exact conclusion I am able to draw.

What is there is a statistical link between grades and frequency of eating breakfast for Webb boarding students

500

These are the two standard M&M colors that are not standard Skittles colors

What is blue and brown

500

This is an example of a pseudorandom number generation method

Multiple answers accepted; the most likely would be... What is using the digits of an irrational number?
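
As one possible illustration of that response, the sketch below reads "random-looking" digits off the decimal expansion of the square root of 2. The choice of that number and both function names are assumptions made for the example. The output is fully determined by the inputs, which is exactly what makes it pseudorandom.

```python
from math import isqrt

def sqrt2_digits(n):
    """Return the first n decimal digits of sqrt(2) as a string."""
    # isqrt(2 * 10**(2n)) equals floor(sqrt(2) * 10**n), computed with exact integers.
    return str(isqrt(2 * 10 ** (2 * n)))[1:]

def pseudo_random_digits(count, offset=0):
    """Read `count` digits starting `offset` places after the decimal point."""
    digits = sqrt2_digits(offset + count)
    return [int(d) for d in digits[offset:]]

print(sqrt2_digits(10))                   # 4142135623
print(pseudo_random_digits(5, offset=3))  # [2, 1, 3, 5, 6]
```

Here `offset` plays the role of a seed: the same offset always reproduces the same digits.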
