Data Types/Structures
Probability Playground
Visualize This
Hypothesis Heroes
UC Berkeley Trivia
100

What type of data structure is this? structure = [1, 3, 5, 10]

What is a list?

100

A bag contains 2 red, 2 blue, and 1 green marble. You pick one at random.

What is the probability it is not green?

What is 4/5?

100

You have a DataFrame df with a numerical column height. Fill in the blank: we can use a _____ to visualize the distribution of height values by grouping them into bins/buckets.

What is a histogram?

100

This hypothesis claims that any difference you see in the data is due to random chance alone.

What is the null hypothesis?

100

Fill in the blank: UC Berkeley is the _____ ranked public institution in the United States.  

What is best/first/#1?

200

What type of data structure is s? s = df["names"]. Assume that df is a DataFrame.

What is a Series?

200

You roll a fair six-sided die. What is the probability of rolling a number greater than 2?

What is 2/3?

200

You have a DataFrame df with a categorical column genre. Which type of plot shows how many counts there are of each genre?

What is a bar/count plot?

200

This hypothesis is the one you look for evidence in favor of, usually that there is a difference or effect.

What is the alternative hypothesis?

200

What is UC Berkeley’s iconic tower called?

What is the Campanile?

300

If we had a starting value, end value, as well as a step size to iterate over some data, what data type would we use?

What is a range?
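
For anyone checking this response, here is a minimal sketch of a range built from a start value, an end value, and a step size. The specific numbers (0, 10, 2) are made up for illustration; note that Python treats the end value as exclusive.

```python
# Illustrative values only: start=0, stop=10 (exclusive), step=2.
evens = range(0, 10, 2)

print(type(evens))   # <class 'range'>
print(list(evens))   # [0, 2, 4, 6, 8]

# A range is typically used to drive a loop.
for i in evens:
    print(i)
```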

300

What is this rule called: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

What is the addition rule?
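
The addition rule, P(A ∪ B) = P(A) + P(B) - P(A ∩ B), can be sanity-checked by counting outcomes. The sketch below assumes a fair six-sided die and two events picked purely for illustration (A: the roll is even, B: the roll is greater than 3).

```python
from fractions import Fraction

outcomes = set(range(1, 7))                 # fair six-sided die
A = {n for n in outcomes if n % 2 == 0}     # illustrative event A: even roll
B = {n for n in outcomes if n > 3}          # illustrative event B: roll > 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

lhs = prob(A | B)                           # P(A or B), counted directly
rhs = prob(A) + prob(B) - prob(A & B)       # addition rule

print(lhs, rhs)                             # 2/3 2/3
```

Subtracting P(A ∩ B) removes the outcomes (here 4 and 6) that would otherwise be counted twice.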

300

You have a dataframe containing the following columns: name, student ID, age, gender, test score, and height. How many numerical variables are there?

What is 3?

300

This statistic tells you what proportion of simulated samples produced a test statistic at least as extreme as the one you actually observed (assuming the null hypothesis is true).

What is the p-value?
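
One way to see where a simulated p-value comes from is sketched below. Everything in it is an assumption for illustration: a made-up observation of 60 heads in 100 coin flips, a null hypothesis that the coin is fair, and the distance from 50 heads as the test statistic.

```python
import random

rng = random.Random(0)          # fixed seed so the sketch is repeatable

n_flips = 100
observed_heads = 60             # hypothetical observation
observed_stat = abs(observed_heads - n_flips / 2)

def simulate_stat():
    """Test statistic for one sample simulated under the null (fair coin)."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return abs(heads - n_flips / 2)

n_sims = 10_000
simulated = [simulate_stat() for _ in range(n_sims)]

# Proportion of simulated statistics at least as extreme as the observed one.
p_value = sum(s >= observed_stat for s in simulated) / n_sims
print(p_value)
```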

300

What is the name of UC Berkeley’s mascot?

What is Oski?

400

What type of data structure is "something"? something = df[["A_Column"]]

What is a DataFrame?

400

You draw one card from a deck with cards labeled 1, 2, 3, 4, 5. You're told the card is odd. What is the probability the card is 5?

What is 1/3?
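
The conditional probability here can be checked by shrinking the sample space to the odd cards and counting. This is only a sketch, and it assumes each card is equally likely to be drawn.

```python
from fractions import Fraction

cards = [1, 2, 3, 4, 5]

# Conditioning on "the card is odd" leaves only these outcomes.
odd_cards = [c for c in cards if c % 2 == 1]     # [1, 3, 5]

# Among the odd cards, exactly one is the 5.
p_five_given_odd = Fraction(odd_cards.count(5), len(odd_cards))

print(p_five_given_odd)                          # 1/3
```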

400

You make a box plot of age for animals. The box is very tall and there are several dots far above the top whisker. What do the dots represent?

What are outliers?

400

This value is calculated from your sample and used to measure how far the data deviates from what the null predicts.

What is a test statistic?

400

What is UC Berkeley’s #1 most enrolled major?

What is Data Science?

500

What type of data structure is this? name_to_age = {"John": 40, "Jake": 25, "Tim": 30}

What is a dictionary?

500

From a standard deck of 52 playing cards (13 of each suit), given that the first card you drew is the ace of hearts, what is the probability that the next card is either another ace OR a spade?

What is 15/51?
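
The count behind 15/51 can be verified by enumerating the deck. The sketch below assumes the ace of hearts is removed and not replaced, then counts the remaining cards that are an ace or a spade (the ace of spades is counted once).

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))               # 52 (rank, suit) pairs

# The first card drawn was the ace of hearts, so it is no longer in the deck.
remaining = [card for card in deck if card != ("A", "hearts")]

# Favorable second cards: any remaining ace, or any spade.
favorable = [(r, s) for r, s in remaining if r == "A" or s == "spades"]

print(len(favorable), len(remaining))            # 15 51
p = Fraction(len(favorable), len(remaining))
print(p)                                         # 5/17
```

Fraction reduces 15/51 to 5/17 automatically; both forms are the same probability.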

500

Give two reasons why the following line of code fails to generate a scatter plot:

sns.scatter(data=df, x="age")

What is using the wrong function name (it should be scatterplot) and not providing a y variable (a scatter plot is 2-dimensional)?

500

Suppose after doing a hypothesis test you obtain a p-value of 0.03 and your p-value cutoff is 0.05. What conclusion do you make? 

What is rejecting the null hypothesis?

500

Name any 3 famous UC Berkeley alumni.

ANY 3 CORRECT RESPONSES (VERIFY)