Join Forces
Aggregate State
Partition Impossible
SQL Me This?
The ETL Strikes Back
100

You’re joining a customers table to a transactions table and want to include all customers, even those without any matching transactions. This type of join ensures all customers appear in the result.

What is a LEFT JOIN?

100

This aggregate function returns the first non-null value from a group of rows based on order.

What is FIRST_VALUE()?

100

In SQL, this essential clause is required after ROW_NUMBER(), LEAD(), or NTILE() to define how rows are grouped and ordered.

What is the OVER clause?

100

Riddle: I keep duplicates out and make your dataset lean. I go by one name, but in truth I'm quite mean.

What is DISTINCT?

100

This type of derived table in Looker is built once and stored in the database, unlike regular derived tables which are recomputed at query time.

What is a Persistent Derived Table (PDT)?

200

If you forget to include join keys, this type of join will return a Cartesian product.

What is a CROSS JOIN?

200

In written SQL syntax, these clauses follow a specific order, starting with SELECT and ending with LIMIT.

(LIMIT, WHERE, GROUP BY, SELECT, JOIN, HAVING, QUALIFY, FROM, and ORDER BY)

What is SELECT → FROM → JOIN → WHERE → GROUP BY → HAVING → QUALIFY → ORDER BY → LIMIT?

200

This window function lets you access the previous row’s value within the same partition.

What is LAG()?

200

Riddle: Some say I’m a condition, but I’m really a vibe. When you want to filter groups, I’ll help you decide.

What is HAVING?

200

In LookML, this parameter connects a derived table’s refresh logic to a defined caching policy, allowing the table to rebuild only when the associated freshness condition is met.

What is datagroup_trigger?


300

This type of join returns unmatched rows from both joined tables, filling in NULLs where data is missing on either side.

What is a full outer join?

300

This aggregate function in Snowflake combines distinct values from a column into a single comma-separated string.

What is LISTAGG(DISTINCT column, ',')?

300

This type of Snowflake object allows results of complex queries to be cached for reuse, improving performance across sessions.

What is a materialized view?

300

Riddle: I come at the end but affect what you see. Sort by me, and your data flows like tea.

What is ORDER BY?

300

In Looker, this parameter applies a required filter through the UI, while its SQL counterpart applies the constraint invisibly in the SQL regardless of user input.

What is the difference between always_filter and sql_always_where?

400

You need to generate a report that shows every product for every day of the year, even when no activity occurred. This type of join will create every possible combination between the products and calendar tables.

What is a CROSS JOIN?

400

This clause can be used in Snowflake to filter rows after applying a window function, without relying on HAVING.

What is QUALIFY?

400

Using this pattern in a filter disables pruning and results in full table scans with expensive string comparisons in Snowflake.

What is ILIKE '%term%'?

400

Riddle: I can be correlated or subtracted. Put me in a FROM or SELECT, and you’ll feel impacted.

What is a Subquery?

400

These are the four valid values for the materialized config in a dbt model.

What are view, table, incremental, and ephemeral?

500

This trio of keys includes one that uniquely identifies a row, one that links tables, and one that is system-generated, often numeric or a UUID.

What are a primary key, a foreign key, and a surrogate key?

500

This is the logical execution order of SQL clauses from data source access to final output, not to be confused with written syntax order.

(LIMIT, WHERE, GROUP BY, SELECT, JOIN, HAVING, QUALIFY, FROM, and ORDER BY)

What is FROM → JOIN → WHERE → GROUP BY → HAVING → QUALIFY → SELECT → ORDER BY → LIMIT?

500

These two functions differ in how they handle ties: one skips rank numbers, the other doesn’t.

What are RANK() and DENSE_RANK()?

500

Riddle: I let you peek at your neighbor, without needing to scroll. Forward or backward, I give you control.

What is LAG() / LEAD()?

500

In dbt, this function references another dbt model, while its counterpart is used to reference raw source data.

What are ref() and source()?