Sports & Pop Culture
USC
Programming
Data Science
Machine Learning
100

This USC quarterback won the Heisman Trophy in 2002 and had a successful NFL career, particularly with the Bengals, Raiders, and Arizona Cardinals

Carson Palmer

100

Which is the newest building on campus? (+20 points for full name)

Dr. Allen and Charlotte Ginsburg Human-Centered Computation Hall

100

This Python data structure is ordered, allows duplicate elements, and can store multiple values of different data types

List
100

Name three commonly used software tools for data visualization

Any 3: Matplotlib, Seaborn, Tableau, Power BI, Looker Studio, Spotfire, Sisense, Qlik

100

This type of machine learning involves labeled data and is used for tasks like classification and regression

Supervised learning

200

Question:
This former USC running back was inducted into the Pro Football Hall of Fame in 1985 and had a highly publicized legal case in the 1990s

O.J. Simpson

200

WHICH USC COACH TRANSFORMED THE FOOTBALL TEAM INTO NATIONAL CONTENDERS IN THE 2000s?

Pete Carrol

200

This statistical term refers to the measure of how spread out the values in a dataset are, often calculated as the square root of the variance

Standard deviation

200

This type of database management system (DBMS) stores data in tables and is accessed using Structured Query Language (SQL)

relational database

200

This metric is used to measure the accuracy of a regression model by calculating the average of the squared differences between actual and predicted values

Mean squared error (MSE)

300

Question:
This USC alumnus is one of the highest-grossing directors of all time, known for directing films like Star Wars and Indiana Jones

George Lucas

300

WHICH FAMOUS FILMMAKER DONATED TO USC IN 2006?

George Lucas

300

This technique in data preprocessing involves replacing missing values in a dataset with the mean, median, or mode of that feature

Imputation

Imputation is a statistical technique used to fill in missing data points in a dataset. The goal is to replace missing values with substituted values, which can help maintain the integrity of the dataset for analysis and modeling. Common methods for imputation include replacing missing values with the mean, median, or mode of the respective column.

Example Code

Here’s a simple example using the pandas library in Python to perform imputation on a DataFrame:

import pandas as pd
import numpy as np

# Create a sample DataFrame with missing values
data = {
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [10, 11, 12, 13]
}
df = pd.DataFrame(data)

# Impute missing values with the mean of each column
df_imputed = df.fillna(df.mean())

print("Original DataFrame:")
print(df)
print("\nDataFrame after imputation:")
print(df_imputed)

Explanation:

  • This code creates a DataFrame df with missing values (represented as np.nan).
  • The fillna() method is used to replace the missing values with the mean of each column.
  • The original DataFrame and the imputed DataFrame are printed to show the difference.

By using imputation, you can ensure that your dataset remains useful for analysis without losing valuable information due to missing values.

300

This non-relational database is known for storing data in a flexible, JSON-like format and is often used for handling unstructured data

MongoDB

300

The ________ is a performance measure for classification models that calculates the harmonic mean of precision and recall, providing a balance between the two metrics, especially in cases of class imbalance

F1 score

400

This USC dorm was featured in The Office in a scene where Pam went to art school

Pardee Tower

400

HOW MANY HEISMAN TROPHY WINNERS HAVE COME FROM USC?

8

400

This type of function in Python is often used for small, one-off operations and doesn't require a formal def statement. It's typically written in a single line and used in cases like sorting or filtering

lambda function

400

Question:
This command in SQL is used to combine rows from two or more tables based on a related column between them

JOIN

400

Using scikit-learn, write a single line of code to split the dataset X and y into training and test sets with 80% of the data for training and a random seed of 42

X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.2, random_state=42)

500

This USC alumna is the most decorated female track and field athlete in Olympic history, with a total of 11 Olympic medals, including 7 golds

Allyson Felix

500

How many fountains do we have on campus? (±2)

38

500

In Python, this built-in function allows you to iterate over two or more iterables (like lists or tuples) in parallel, returning tuples containing elements from each iterable at the same position

zip()

500

Write an SQL query to retrieve all columns from a table named Employees where the employee's department is 'Sales'

SELECT * FROM Employees WHERE department = 'Sales';

500

Convert the following mathematical function into a NumPy expression

 np.sum(3 * np.square(x) + 2 * x + 1)

M
e
n
u