This USC quarterback won the Heisman Trophy in 2002 and had a successful NFL career, particularly with the Bengals, Raiders, and Arizona Cardinals
Carson Palmer
Which is the newest building on campus? (+20 points for full name)
Dr. Allen and Charlotte Ginsburg Human-Centered Computation Hall
This Python data structure is ordered, allows duplicate elements, and can store multiple values of different data types
Name three commonly used software tools for data visualization
Any 3: Matplotlib, Seaborn, Tableau, Power BI, Looker Studio, Spotfire, Sisense, Qlik
This type of machine learning involves labeled data and is used for tasks like classification and regression
Supervised learning
Question: This former USC running back was inducted into the Pro Football Hall of Fame in 1985 and had a highly publicized legal case in the 1990s
O.J. Simpson
WHICH USC COACH TRANSFORMED THE FOOTBALL TEAM INTO NATIONAL CONTENDERS IN THE 2000s?
Pete Carrol
This statistical term refers to the measure of how spread out the values in a dataset are, often calculated as the square root of the variance
Standard deviation
This type of database management system (DBMS) stores data in tables and is accessed using Structured Query Language (SQL)
relational database
This metric is used to measure the accuracy of a regression model by calculating the average of the squared differences between actual and predicted values
Mean squared error (MSE)
Question: This USC alumnus is one of the highest-grossing directors of all time, known for directing films like Star Wars and Indiana Jones
George Lucas
WHICH FAMOUS FILMMAKER DONATED TO USC IN 2006?
George Lucas
This technique in data preprocessing involves replacing missing values in a dataset with the mean, median, or mode of that feature
Imputation
Imputation is a statistical technique used to fill in missing data points in a dataset. The goal is to replace missing values with substituted values, which can help maintain the integrity of the dataset for analysis and modeling. Common methods for imputation include replacing missing values with the mean, median, or mode of the respective column.
Here’s a simple example using the pandas
library in Python to perform imputation on a DataFrame:
import pandas as pd
import numpy as np
# Create a sample DataFrame with missing values
data = {
'A': [1, 2, np.nan, 4],
'B': [5, np.nan, np.nan, 8],
'C': [10, 11, 12, 13]
}
df = pd.DataFrame(data)
# Impute missing values with the mean of each column
df_imputed = df.fillna(df.mean())
print("Original DataFrame:")
print(df)
print("\nDataFrame after imputation:")
print(df_imputed)
df
with missing values (represented as np.nan
).fillna()
method is used to replace the missing values with the mean of each column.By using imputation, you can ensure that your dataset remains useful for analysis without losing valuable information due to missing values.
This non-relational database is known for storing data in a flexible, JSON-like format and is often used for handling unstructured data
MongoDB
The ________ is a performance measure for classification models that calculates the harmonic mean of precision and recall, providing a balance between the two metrics, especially in cases of class imbalance
F1 score
This USC dorm was featured in The Office in a scene where Pam went to art school
Pardee Tower
HOW MANY HEISMAN TROPHY WINNERS HAVE COME FROM USC?
8
This type of function in Python is often used for small, one-off operations and doesn't require a formal def statement. It's typically written in a single line and used in cases like sorting or filtering
lambda function
Question: This command in SQL is used to combine rows from two or more tables based on a related column between them
JOIN
Using scikit-learn, write a single line of code to split the dataset X and y into training and test sets with 80% of the data for training and a random seed of 42
X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.2, random_state=42)
This USC alumna is the most decorated female track and field athlete in Olympic history, with a total of 11 Olympic medals, including 7 golds
Allyson Felix
How many fountains do we have on campus? (±2)
38
In Python, this built-in function allows you to iterate over two or more iterables (like lists or tuples) in parallel, returning tuples containing elements from each iterable at the same position
zip()
Write an SQL query to retrieve all columns from a table named Employees where the employee's department is 'Sales'
SELECT * FROM Employees WHERE department = 'Sales';
Convert the following mathematical function into a NumPy expression
np.sum(3 * np.square(x) + 2 * x + 1)