What type of data structure is this? structure = [1, 3, 5, 10]
What is a list?
A bag contains 2 red, 2 blue, and 1 green marble. You pick one at random.
What is the probability it is not green?
What is 4/5?
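One way to check this answer is the complement rule, P(not green) = 1 - P(green). The sketch below is only an illustration: it uses the marble counts from the question, and the variable names are made up for the example.

```python
from fractions import Fraction

# Marble counts taken from the question above.
marbles = {"red": 2, "blue": 2, "green": 1}
total = sum(marbles.values())

p_green = Fraction(marbles["green"], total)
p_not_green = 1 - p_green  # complement rule

print(p_not_green)  # 4/5
```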
You have a DataFrame df with a numerical column height. We can use a _____ to visualize the distribution of height values by grouping them into bins/buckets.
What is a histogram?
This hypothesis claims that any difference you see in the data is due to random chance alone.
What is the null hypothesis?
Fill in the blank: UC Berkeley is the _____ ranked public institution in the United States.
What is best/first/#1?
What type of data structure is s? s = df["names"]. Assume that df is a DataFrame.
What is a series?
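A small sketch of why this is a Series, assuming pandas is installed. The DataFrame contents here are hypothetical; only the column name "names" comes from the question. Selecting with single brackets returns a Series, while selecting with a list of column names returns a DataFrame.

```python
import pandas as pd

# Hypothetical data; only the "names" column appears in the question.
df = pd.DataFrame({"names": ["Ada", "Grace"], "age": [36, 45]})

s = df["names"]      # single brackets: one column as a Series
sub = df[["names"]]  # list of column names: a one-column DataFrame

print(type(s).__name__)    # Series
print(type(sub).__name__)  # DataFrame
```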
You roll a fair six-sided die. What is the probability of rolling a number greater than 2?
What is 2/3?
You have a DataFrame df with a categorical column genre. Which type of plot shows how many counts there are of each genre?
What is a bar/count plot?
This hypothesis is the one you look for evidence in favor of, usually that there is a difference or effect.
What is the alternative hypothesis?
What is UC Berkeley’s iconic tower called?
What is The Campanile?
If we had a starting value, end value, as well as a step size to iterate over some data, what data type would we use?
What is a range?
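For illustration, Python's built-in range takes a start, a stop (exclusive), and a step. The specific numbers below are arbitrary and chosen only for the example.

```python
# start=0, stop=10 (exclusive), step=2; values chosen only for illustration.
steps = range(0, 10, 2)

for i in steps:
    print(i)

print(list(steps))  # [0, 2, 4, 6, 8]
```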
What is this rule called: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
What is the addition rule?
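The addition rule subtracts the overlap P(A ∩ B) so that outcomes belonging to both events are not counted twice. A minimal check on one fair six-sided die is sketched below; the events A (even roll) and B (roll greater than 3) are assumptions chosen for the example, not part of the question.

```python
from fractions import Fraction

outcomes = set(range(1, 7))               # one fair six-sided die
A = {n for n in outcomes if n % 2 == 0}   # assumed event A: roll is even
B = {n for n in outcomes if n > 3}        # assumed event B: roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

lhs = prob(A | B)                      # P(A or B), counted directly
rhs = prob(A) + prob(B) - prob(A & B)  # addition rule

print(lhs, rhs)  # 2/3 2/3
```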
You have a dataframe containing the following columns: name, student ID, age, gender, test score, and height. How many numerical variables are there?
What is 3?
This statistic tells you what proportion of simulated samples produced a test statistic at least as extreme as the one you actually observed (assuming the null hypothesis is true).
What is the p-value?
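A rough sketch of how such a proportion can be computed by simulation. Everything concrete here is an assumption made for illustration: the experiment (100 coin flips), the observed statistic (60 heads), the one-sided definition of "extreme", and the number of simulations.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

n_flips = 100           # hypothetical experiment: 100 coin flips
observed_heads = 60     # hypothetical observed test statistic
n_simulations = 10_000  # number of samples simulated under the null

def simulate_heads(n):
    """Count heads in n flips of a fair coin (the null hypothesis)."""
    return sum(random.random() < 0.5 for _ in range(n))

simulated = [simulate_heads(n_flips) for _ in range(n_simulations)]

# One-sided p-value: share of simulated samples at least as extreme as observed.
p_value = sum(h >= observed_heads for h in simulated) / n_simulations
print(p_value)
```

The printed value is an estimate and will vary slightly with the seed and the number of simulations.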
What is the name of UC Berkeley’s mascot?
What is Oski?
What type of data structure is “something”? something = df[["A_Column"]]
What is a DataFrame?
You draw one card from a deck with cards labeled 1, 2, 3, 4, 5. You're told the card is odd. What is the probability the card is 5?
What is 1/3?
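Conditioning on "the card is odd" shrinks the sample space to the odd cards only. The enumeration below is a small sketch of that reasoning; the variable names are illustrative.

```python
from fractions import Fraction

cards = [1, 2, 3, 4, 5]
odd_cards = [c for c in cards if c % 2 == 1]  # condition: the card is odd

# Within the reduced sample space, count the outcomes equal to 5.
p_five_given_odd = Fraction(odd_cards.count(5), len(odd_cards))

print(odd_cards)         # [1, 3, 5]
print(p_five_given_odd)  # 1/3
```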
You make a box plot of age for animals. The box is very tall and there are several dots far above the top whisker. What do the dots represent?
What are outliers?
This value is calculated from your sample and used to measure how far the data deviates from what the null predicts.
What is a test statistic?
What is UC Berkeley’s #1 most enrolled major?
What is Data Science?
What type of data structure is this? name_to_age = {"John": 40, "Jake": 25, "Tim": 30}
What is a dictionary?
From a standard deck of 52 playing cards (13 of each suit), the first card you draw is the ace of hearts. Without replacing it, what is the probability that the next card is either another ace OR a spade?
What is 15/51?
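The count of 15 comes from 3 remaining aces plus 13 spades, minus 1 for the ace of spades, which would otherwise be counted twice. The sketch below checks this by listing the 51 remaining cards; the card representation is an assumption made for the example.

```python
from fractions import Fraction
from itertools import product

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]

deck = list(product(ranks, suits))  # 52 (rank, suit) pairs
deck.remove(("A", "hearts"))        # first card drawn, not replaced

# Favorable second cards: any remaining ace, or any spade.
favorable = [card for card in deck if card[0] == "A" or card[1] == "spades"]

p = Fraction(len(favorable), len(deck))
print(len(favorable), len(deck))  # 15 51
print(p)                          # 5/17, the reduced form of 15/51
```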
Give two reasons why the following line of code is wrong to generate a scatter plot:
sns.scatter(data=df, x="age")
What is using the wrong function name (it should be scatterplot) and not providing a y variable (a scatter plot is 2-dimensional)?
Suppose after doing a hypothesis test you obtain a p-value of 0.03 and your p-value cutoff is 0.05. What conclusion do you make?
What is rejecting the null hypothesis?
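The decision rule can be written as a short comparison. This is a sketch only: the function name is hypothetical, and whether a p-value exactly equal to the cutoff counts as a rejection is a convention that varies by course (a strict inequality is assumed here).

```python
def decide(p_value, cutoff=0.05):
    """Return the conclusion of a test given a p-value and a cutoff."""
    if p_value < cutoff:  # assumed convention: strict inequality
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

print(decide(0.03))  # reject the null hypothesis
print(decide(0.20))  # fail to reject the null hypothesis
```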
Name any 3 famous UC Berkeley alumni.
ANY 3 CORRECT RESPONSES (VERIFY)