SQL
Python (Pandas)
Basic Stats
Random
100

Select all columns from a table called employees.

SELECT * FROM employees;


100

Read a CSV file called sales.csv into a DataFrame.

import pandas as pd

df = pd.read_csv("sales.csv")


100

What does the mean represent in a dataset?

The mean is the average value (sum of all values ÷ number of values).
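As a quick check of that definition, here is a minimal sketch using made-up numbers (the `values` list is an assumption for illustration, not data from the quiz):

```python
from statistics import mean

values = [10, 20, 30, 40]  # hypothetical sample data

# Sum of all values divided by the number of values.
average = sum(values) / len(values)

print(average)       # 25.0
print(mean(values))  # the standard library computes the same quantity
```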

100

What does DISTINCT do in a query?

Removes duplicate rows from the result, returning only unique combinations of the selected columns (e.g., SELECT DISTINCT customer_id returns each customer once).
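One way to see this is to run the query against a throwaway in-memory SQLite database. The `orders` table and its rows below are hypothetical, chosen only to show that DISTINCT applies to the whole combination of selected columns:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, product TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "pen"), (1, "pen"), (1, "ink"), (2, "pen")],  # hypothetical rows
)

# One selected column: each customer appears once.
unique_customers = conn.execute(
    "SELECT DISTINCT customer_id FROM orders ORDER BY customer_id"
).fetchall()

# Two selected columns: each (customer, product) pair appears once.
unique_pairs = conn.execute(
    "SELECT DISTINCT customer_id, product FROM orders "
    "ORDER BY customer_id, product"
).fetchall()

conn.close()
print(unique_customers)  # [(1,), (2,)]
print(unique_pairs)      # [(1, 'ink'), (1, 'pen'), (2, 'pen')]
```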

200

Count how many rows are in the transactions table.

SELECT COUNT(*) FROM transactions;


200

Show the first 10 rows of a DataFrame.

df.head(10)


200

What does median tell you about a dataset?

The median is the middle value when the data is ordered; it shows the center and is far less affected by outliers than the mean.
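The effect of an outlier can be sketched with a small made-up list (the numbers are assumptions for illustration). One extreme value pulls the mean far from the bulk of the data, while the median stays put:

```python
from statistics import mean, median

values = [30, 32, 35, 38, 1000]  # hypothetical data with one outlier

center_mean = mean(values)      # dragged upward by the 1000
center_median = median(values)  # middle of the sorted values

print(center_mean)    # 227
print(center_median)  # 35
```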

200

Why might you use a WHERE clause vs. a HAVING clause?

WHERE vs HAVING

  • WHERE filters rows before grouping/aggregation.

  • HAVING filters groups after aggregation.
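The difference between the two can be sketched with an in-memory SQLite database. The `sales` rows and the thresholds below are hypothetical; the point is only that the two clauses act at different stages and can give different results:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 5.0), (1, 50.0), (2, 20.0), (2, 30.0), (3, 8.0)],  # hypothetical rows
)

# WHERE: individual rows under 10 are dropped BEFORE the sums are computed.
where_result = conn.execute(
    "SELECT product_id, SUM(amount) FROM sales "
    "WHERE amount >= 10 GROUP BY product_id ORDER BY product_id"
).fetchall()

# HAVING: sums use every row, then whole groups under 50 are dropped.
having_result = conn.execute(
    "SELECT product_id, SUM(amount) FROM sales "
    "GROUP BY product_id HAVING SUM(amount) >= 50 ORDER BY product_id"
).fetchall()

conn.close()
print(where_result)   # [(1, 50.0), (2, 50.0)]
print(having_result)  # [(1, 55.0), (2, 50.0)]
```

Product 1 totals 50.0 in the first query because its 5.0 row was filtered out before summing, and 55.0 in the second because the filter ran only after the sum.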

300

Find the total sales per product_id in the sales table.

SELECT product_id, SUM(amount) AS total_sales

FROM sales

GROUP BY product_id;


300

Get the average of the column order_amount.

df["order_amount"].mean()


300

What is standard deviation measuring?

Standard deviation measures how spread out the values are around the mean; a larger value means the data is more widely scattered.
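A minimal sketch with two made-up lists that share the same mean but differ in spread (both lists are assumptions for illustration). It uses the population standard deviation, `pstdev`; the sample version, `stdev`, divides by n - 1 instead of n:

```python
from statistics import mean, pstdev

tight = [9, 10, 11]  # hypothetical values clustered near the mean
wide = [0, 10, 20]   # hypothetical values far from the mean

# Both lists have a mean of 10, so only the spread differs.
print(mean(tight), mean(wide))

tight_sd = pstdev(tight)  # small: values sit close to 10
wide_sd = pstdev(wide)    # large: values sit far from 10
print(tight_sd < wide_sd)  # True
```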

300

What is the largest desert in the world?

Antarctica

400

Return the 5 most recent orders by order_date.

SELECT *

FROM orders

ORDER BY order_date DESC

LIMIT 5;

400

Filter rows where column region equals "West".

df[df["region"] == "West"]

400

What does a correlation coefficient close to 1 mean?

A correlation coefficient close to 1 means a strong positive linear relationship between two variables: as one increases, the other tends to increase.
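To make this concrete, here is a sketch that computes the Pearson correlation coefficient by hand. The helper name `pearson` and the `hours`/`score` data are hypothetical, chosen so the points rise almost in a straight line:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

hours = [1, 2, 3, 4, 5]       # hypothetical study hours
score = [52, 55, 61, 64, 70]  # hypothetical test scores

r = pearson(hours, score)
print(round(r, 3))  # 0.993
```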

400

What is the only U.S. state without a natural lake?

Maryland

500

Show the earliest order date for each customer_id in the orders table.

SELECT customer_id, MIN(order_date) AS first_order

FROM orders

GROUP BY customer_id;


500

Group by customer_id and count how many orders they made.

df.groupby("customer_id")["order_id"].count()


500

What does a correlation coefficient near 0 mean about two variables?

Little or no linear relationship; the variables may still have a nonlinear relationship.
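The nonlinear caveat can be sketched with made-up data where y is exactly x squared. The helper name `pearson` and the data are hypothetical; y is fully determined by x, yet the correlation coefficient comes out as zero because the relationship is not linear:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

xs = [-2, -1, 0, 1, 2]     # hypothetical inputs, symmetric around zero
ys = [x ** 2 for x in xs]  # a perfect but nonlinear (U-shaped) relationship

r = pearson(xs, ys)
print(r)  # 0.0
```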

500

What is the loudest animal on Earth?

Sperm whale