State the grain of a transaction fact in one sentence.
One row per business event at the moment it occurs (e.g., order line, journal line) with event timestamp and keys.
Define a conformed dimension
A standardized, shared dimension used by multiple facts with consistent keys/attributes.
Difference between a periodic snapshot and a transaction fact.
Snapshot records state at intervals; transaction records events as they happen.
Distinguish a metric from a dimension attribute
Metric = numeric measure to aggregate; attribute = descriptor for slicing/filtering.
Name one row‑level and one aggregate validation to run nightly.
Row: not‑null, referential integrity; Aggregate: record counts or totals by period vs source.
Match each fact type to an example grain: Transaction, Periodic Snapshot, Accumulating Snapshot.
Transaction → per event (e.g., order line). Periodic Snapshot → per entity per period (e.g., account‑month). Accumulating Snapshot → per process instance across milestones (e.g., order from placed→shipped→delivered).
Why use surrogate keys in dimensions?
Stability across source changes, enable SCD versions, decouple natural key volatility.
When is SCD Type 1 preferable to Type 2?
For non‑audited corrections or where only current value matters (e.g., standardized name).
Name two artifacts that tame conflicting business terms.
Business glossary with owners and semantic/BI views reflecting agreed rules.
What belongs in UAT acceptance criteria for a new dashboard?
DoD, named test cases with expected values, data currency/SLA, and sign‑off owner.
Your reporting shows double counts after adding cancellations. What grain change most likely fixes it?
Model at the line‑level event grain (separate events for create/cancel/return) or use an accumulating snapshot with milestone flags instead of aggregating at the order header.
What structure supports rollups across ragged or dated hierarchies?
A hierarchy bridge (parent‑child with effective dates or path enumeration).
Name one reliable way to detect attribute change for SCD2.
Compare a hash of attributes (or column‑by‑column) and start a new version when it differs.
Two discovery questions that lock the fact design.
“One row per what?” and “When is the event measured?” (plus ‘how are updates handled?’ acceptable).
Two tests that catch broken joins in a transactional fact.
Orphan foreign keys check; double‑count guardrails (totals vs source); scan for duplicate natural keys.
Name two symptoms of the wrong fact grain.
Counts drift with joins; totals vary with dimension filters; hard to reconcile to source; duplicates after reloads.
How do you model many‑to‑many relationships from facts to multi‑valued dimensions?
Use a bridge table (fact → bridge → dimension) with allocation weights if needed.
Give a grain and two milestone examples for an accumulating snapshot.
Grain: one row per process instance; milestones: e.g., submitted, approved, completed; store dates and cycle times.
Which artifact keeps sprint scope stable while saving ideas for later?
Sprint backlog for now; product backlog/parking lot for later.
Which three metadata fields make audits happier?
Source system/table, load timestamp/batch id, transform version (e.g., git hash).