What is a Derived Schema?
A 'secondary' layer to the dimensional architecture that piggybacks on existing dimensional structures rather than getting data from original sources.
What is a Merged Fact table?
A fact table that combines facts from across multiple stars.
What is a pivoted fact table?
What is a Sliced fact table?
A sliced fact table has exactly the same structure as the original fact table but contains only a subset of its rows.
Sliced fact tables are typically defined using a specific dimension attribute and may be useful in supporting distributed applications, enforcing role-based security, or reducing schema scope for use in an OLAP tool.
What are set operations?
Set operations take two star schemas as input and produce an output by applying a set operation such as union, intersect, or minus, or an equivalent subquery.
How does a Derived Schema affect query performance?
It can improve query performance, because data is reorganized into a format that answers a specific question.
What is a use for a merged fact table?
To compare plans or goals to actual performance.
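As a rough sketch of the plan-versus-actual case, the snippet below builds a merged fact table from two small stars using Python's built-in sqlite3 module. The table and column names (sales_facts, plan_facts, merged_sales_plan_facts, month_key, product_key) are hypothetical, and the dimension tables are omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two hypothetical source fact tables that share month and product.
    CREATE TABLE sales_facts (month_key INTEGER, product_key INTEGER, sales_dollars REAL);
    CREATE TABLE plan_facts  (month_key INTEGER, product_key INTEGER, plan_dollars REAL);
    INSERT INTO sales_facts VALUES (202401, 1, 900.0), (202401, 2, 450.0);
    INSERT INTO plan_facts  VALUES (202401, 1, 1000.0), (202402, 1, 1100.0);

    -- Merged fact table at the shared grain (month, product).
    -- Stack both sources, pad the missing fact with 0, then sum per key.
    CREATE TABLE merged_sales_plan_facts AS
    SELECT month_key, product_key,
           SUM(sales_dollars) AS sales_dollars,
           SUM(plan_dollars)  AS plan_dollars
    FROM (
        SELECT month_key, product_key, sales_dollars, 0.0 AS plan_dollars FROM sales_facts
        UNION ALL
        SELECT month_key, product_key, 0.0, plan_dollars FROM plan_facts
    )
    GROUP BY month_key, product_key;
""")

# Plan and actual now sit side by side in one row per month and product.
merged_rows = conn.execute(
    "SELECT month_key, product_key, sales_dollars, plan_dollars "
    "FROM merged_sales_plan_facts ORDER BY month_key, product_key"
).fetchall()
for row in merged_rows:
    print(row)
```

Padding with 0 is one possible choice here; storing NULL for a missing fact is another, and which is better depends on how reports should treat "no plan" versus "a plan of zero".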
Why pivot a fact table?
To more easily compare and analyze different attributes or properties of the measures across dimensions.
How do you create slices of a star schema?
A sliced fact table is derived from a star by selecting all rows that refer to a common dimension value. The resulting star has the same attributes as the original but may be significantly smaller.
When would we use set operations?
In instances where we want to record rows only where an activity or transaction occurs rather than record a row for every possible combination of dimension values. They can store aggregated data from the result of unions, differences, etc.
What is an example of a Derived Schema already covered?
Snapshot.
Accumulating Snapshot.
Core Fact Tables.
What does a non-shared dimension do?
It can allow you to filter one of the star schemas in the merged fact with a dimension table that isn't shared by the merged facts.
Give an example of when you might need to pivot a fact table.
In a normal fact table, measures such as "units sold", "revenue", and "profit" would be represented by individual columns. You could pivot the table to represent monthly profit, annual units sold, product categories, etc.
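One common form of pivoting turns the measure columns into rows, with one row per measure and a column that names it. The sketch below assumes a hypothetical sales_facts table and a hypothetical measure_type column. In a real design the measure type would more likely live in its own dimension table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical "normal" fact table: one column per measure.
    CREATE TABLE sales_facts (month_key INTEGER, product_key INTEGER,
                              units_sold REAL, revenue REAL, profit REAL);
    INSERT INTO sales_facts VALUES (202401, 1, 10.0, 500.0, 120.0);

    -- Pivoted fact table: one row per measure, identified by measure_type.
    CREATE TABLE pivoted_sales_facts AS
    SELECT month_key, product_key, 'units_sold' AS measure_type, units_sold AS amount
    FROM sales_facts
    UNION ALL
    SELECT month_key, product_key, 'revenue', revenue FROM sales_facts
    UNION ALL
    SELECT month_key, product_key, 'profit', profit FROM sales_facts;
""")

pivoted_rows = conn.execute(
    "SELECT measure_type, amount FROM pivoted_sales_facts ORDER BY measure_type"
).fetchall()
for row in pivoted_rows:
    print(row)
```

The single source row becomes three rows, so a report can filter or group on measure_type instead of naming each measure column.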
What can we use a sliced fact for?
Most commonly, a slice focuses on regional or departmental data. Slices also limit the scope of the data set without sacrificing detail, which makes them easier to deploy across locations.
What are some examples of set operations?
Order facts that only report orders when they take place.
A fact table that records the assignment of customers to salespeople, with a row added whenever a customer is assigned to a salesperson.
What are the 2 major costs of a Derived Schema?
More costly ETL processing.
Multiple Star schemas to choose from.
How can you support BI tools that cannot drill across with a Merged Fact table?
By design, merged fact tables allow you to access data from multiple facts without the need to drill across.
What is an advantage and a drawback of pivoting a fact table?
Advantage: There is no need for complex SQL queries to present the measures in a layout that the original star schema does not provide.
Drawback: Maintaining a pivoted schema can be complicated and difficult for developers or analysts not familiar with the schema or how pivoted facts work.
Give an example of a sliced star schema.
We have a star schema that tracks orders by day, product, salesperson, and region. One of the attributes of the salesperson dimension table is region. This attribute takes on values such as “East,” “West,” and so forth. Order_facts contains orders for all regions. A regional slice contains only the order_facts rows for a single region, such as “East.”
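A minimal sketch of deriving the "East" slice from this star, using Python's built-in sqlite3 module. The key columns, the order_dollars fact, and the order_facts_east table name are assumptions made for illustration, and only the salesperson dimension is modeled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, trimmed-down version of the orders star.
    CREATE TABLE salesperson (salesperson_key INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE order_facts (day_key INTEGER, product_key INTEGER,
                              salesperson_key INTEGER, order_dollars REAL);
    INSERT INTO salesperson VALUES (1, 'Ann', 'East'), (2, 'Bo', 'West');
    INSERT INTO order_facts VALUES
        (20240101, 10, 1, 100.0),
        (20240101, 11, 2, 250.0),
        (20240102, 10, 1, 75.0);

    -- The slice keeps every column of order_facts but only the East rows.
    CREATE TABLE order_facts_east AS
    SELECT f.*
    FROM order_facts AS f
    JOIN salesperson AS s ON s.salesperson_key = f.salesperson_key
    WHERE s.region = 'East';
""")

east_rows = conn.execute(
    "SELECT * FROM order_facts_east ORDER BY day_key"
).fetchall()
for row in east_rows:
    print(row)
```

The slice has the same columns and grain as order_facts, so queries written against the full star run unchanged against it.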
Why precompute set operations?
Precomputing the results of set operations is most often useful with intersect and minus operations and almost always involves a coverage table. This can help simplify queries later on in the process, offload complex queries to 'lower' tables that may not need to be processed as often, and improve performance.
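As an illustration, a minus operation between a coverage table and a transaction fact table can be precomputed once and stored. The sketch below assumes hypothetical tables assignment_facts (the coverage table of customer-to-salesperson assignments) and order_facts, and uses SQLite's EXCEPT, which plays the role of MINUS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical coverage table: which customers are assigned to which salesperson.
    CREATE TABLE assignment_facts (customer_key INTEGER, salesperson_key INTEGER);
    -- Hypothetical transaction fact table: rows exist only when an order occurs.
    CREATE TABLE order_facts (customer_key INTEGER, salesperson_key INTEGER, order_dollars REAL);
    INSERT INTO assignment_facts VALUES (1, 10), (2, 10), (3, 11);
    INSERT INTO order_facts VALUES (1, 10, 200.0), (1, 10, 50.0);

    -- Precomputed minus: assignments that have produced no orders.
    CREATE TABLE no_order_facts AS
    SELECT customer_key, salesperson_key FROM assignment_facts
    EXCEPT
    SELECT customer_key, salesperson_key FROM order_facts;
""")

no_orders = conn.execute(
    "SELECT customer_key, salesperson_key FROM no_order_facts ORDER BY customer_key"
).fetchall()
for row in no_orders:
    print(row)
```

Reports that ask "which assigned customers have not ordered?" can then read no_order_facts directly instead of repeating the set operation.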
How would a Derived Schema affect Reports?
A derived schema will make it easier to write complex reports and lower report query times, since you are able to design a Derived Schema to answer specific questions.
What would a merge fact table of the following contain?
What would a Pivoted fact table look like for the following star?
How would you combine slices to derive a whole fact?
Union all the slices, taking care that the slices keep a consistent structure across the ETL and do not overlap.
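A small sketch of recombining two regional slices, assuming hypothetical tables order_facts_east and order_facts_west with identical columns. The overlap check assumes each salesperson belongs to exactly one region, so no salesperson_key should appear in both slices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two hypothetical slices with the same structure.
    CREATE TABLE order_facts_east (day_key INTEGER, product_key INTEGER,
                                   salesperson_key INTEGER, order_dollars REAL);
    CREATE TABLE order_facts_west (day_key INTEGER, product_key INTEGER,
                                   salesperson_key INTEGER, order_dollars REAL);
    INSERT INTO order_facts_east VALUES (20240101, 10, 1, 100.0), (20240102, 10, 1, 75.0);
    INSERT INTO order_facts_west VALUES (20240101, 11, 2, 250.0);

    -- UNION ALL keeps every row; it relies on the slices not overlapping.
    CREATE TABLE order_facts AS
    SELECT * FROM order_facts_east
    UNION ALL
    SELECT * FROM order_facts_west;
""")

# Sanity check under the one-region-per-salesperson assumption.
overlap = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT salesperson_key FROM order_facts_east
        INTERSECT
        SELECT salesperson_key FROM order_facts_west
    )
""").fetchone()[0]

all_rows = conn.execute(
    "SELECT * FROM order_facts ORDER BY day_key, product_key"
).fetchall()
print(overlap, all_rows)
```

If the overlap count were not zero, UNION ALL would double count the shared rows, which is why the slices must be kept disjoint.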
What is an example of a use case for a Precomputed Set Operation?
We have a fact called fact_sales that contains data for a chain of retail stores about each sale, including the store location, the product sold, the date of the sale, and the revenue generated.
We want to answer the question "what is the total revenue generated by all stores in the North region for Q1 of last year?"
We could use a precomputed set operation fact table called "quarterly_sales_by_region" that contains pre-aggregated data for each region and quarter. This makes access to the data faster than querying each star with costly joins and aggregations, and the pre-aggregated table is much smaller than the detail-level fact table.
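The example above could be sketched as follows. The fact_sales and quarterly_sales_by_region names come from the example. The column names, the sample rows, and the choice to store region directly on the fact row (rather than in a store dimension) are simplifying assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical detail-level fact table, one row per sale.
    CREATE TABLE fact_sales (sale_date TEXT, region TEXT, product TEXT, revenue REAL);
    INSERT INTO fact_sales VALUES
        ('2023-01-15', 'North', 'widget', 100.0),
        ('2023-03-02', 'North', 'gadget', 250.0),
        ('2023-04-10', 'North', 'widget', 80.0),
        ('2023-02-20', 'South', 'widget', 60.0);

    -- Precomputed table: one row per region, year, and quarter.
    -- (month + 2) / 3 with integer division maps months 1-12 to quarters 1-4.
    CREATE TABLE quarterly_sales_by_region AS
    SELECT region,
           CAST(strftime('%Y', sale_date) AS INTEGER) AS sales_year,
           (CAST(strftime('%m', sale_date) AS INTEGER) + 2) / 3 AS sales_quarter,
           SUM(revenue) AS total_revenue
    FROM fact_sales
    GROUP BY 1, 2, 3;
""")

# The business question becomes a single-row lookup.
north_q1 = conn.execute(
    "SELECT total_revenue FROM quarterly_sales_by_region "
    "WHERE region = 'North' AND sales_year = 2023 AND sales_quarter = 1"
).fetchone()[0]
print(north_q1)
```

The ETL process would need to rebuild or update quarterly_sales_by_region whenever fact_sales changes, which is part of the extra ETL cost of a derived schema.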