Chapter 14!

Derived Schemas

Merged Fact Tables

The Pivoted Fact Table

Sliced Fact Table

Set Operation Fact Tables

100

What is a Derived Schema?

A 'secondary' layer to the dimensional architecture that piggybacks on existing dimensional structures rather than getting data from original sources.

100

What is a Merged Fact table?

A fact table that combines facts from across multiple stars.

100

What is a pivoted fact table?

Data from an original fact table is transposed from a row-wise orientation to a column-wise orientation, or vice versa, to simplify certain forms of reporting.

100

What is a Sliced fact table?

A sliced fact table has exactly the same structure as the fact table in the original star but contains only a subset of its rows.

Sliced fact tables are typically defined using a specific dimension attribute and may be useful in providing distributed applications, enforcing role-based security, or in reducing schema scope for use in an OLAP tool.

100

What are set operations?

Set operations such as union, intersect, and minus take two star schemas as input and produce an output by comparing their rows. The comparison can be written with a set operator or with a subquery.

200

How does a Derived Schema affect query performance?

It improves performance: data is reorganized in advance into a format that answers a specific question, so queries do less work at run time.

200

What is a use for a merged fact table?

To compare plans or goals to actual performance.
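A minimal sketch of such a merge, assuming two hypothetical stars (sales_facts and plan_facts) that share the month and region dimensions. All table names, column names, and figures are illustrative. Sales are rolled up to the common grain, so the product dimension, which the plan star does not share, is dropped.

```python
from collections import defaultdict

# Hypothetical rows from two separate stars; names and numbers are illustrative.
sales_facts = [
    {"month": "2023-01", "region": "East", "product": "A", "sales_dollars": 100.0},
    {"month": "2023-01", "region": "East", "product": "B", "sales_dollars": 50.0},
    {"month": "2023-01", "region": "West", "product": "A", "sales_dollars": 80.0},
]
plan_facts = [
    {"month": "2023-01", "region": "East", "plan_dollars": 120.0},
    {"month": "2023-01", "region": "West", "plan_dollars": 90.0},
    {"month": "2023-02", "region": "East", "plan_dollars": 130.0},
]


def merge_facts(sales, plans):
    """Combine both stars at their common grain (month, region)."""
    merged = defaultdict(lambda: {"sales_dollars": 0.0, "plan_dollars": 0.0})
    for row in sales:
        # Summing here rolls sales up past the non-shared product dimension.
        merged[(row["month"], row["region"])]["sales_dollars"] += row["sales_dollars"]
    for row in plans:
        merged[(row["month"], row["region"])]["plan_dollars"] += row["plan_dollars"]
    return dict(merged)


merged_facts = merge_facts(sales_facts, plan_facts)
for (month, region), facts in sorted(merged_facts.items()):
    print(month, region, facts["sales_dollars"], facts["plan_dollars"])
```

A month and region with a plan but no sales still gets a row, with sales_dollars left at zero, which is what lets a report compare plan to actual without drilling across.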

200

Why pivot a fact table?

To more easily compare and analyze different attributes or properties of the measures across dimensions.

200

How do you create slices of a star schema?

A sliced fact table is derived from a star by selecting all rows that refer to a common dimension value. The resulting star has the same attributes as the original but may be significantly smaller.

200

When would we use set operations?

In instances where we want to record rows only where an activity or transaction occurs rather than record a row for every possible combination of dimension values. They can store aggregated data from the result of unions, differences, etc.

300

What is an example of a Derived Schema already covered?

Snapshot.

Accumulating Snapshot.

Core Fact Tables.

300

What does a non-shared dimension do?

It can allow you to filter one of the star schemas in the merged fact with a dimension table that isn't shared by the merged facts.

300

Give an example of when you might need to pivot a fact table.

In a normal fact table, measures such as "units sold", "revenue", and "profit" would be represented by individual columns. You could pivot the table to represent monthly profit, annual units sold, product categories, etc.
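A small sketch of pivoting in both directions, assuming a hypothetical monthly fact table with units_sold, revenue, and profit columns. The column names, the measure_type label, and the figures are made up for illustration.

```python
# Column-wise orientation: one row per month, one column per measure.
# Hypothetical data for illustration only.
column_wise = [
    {"month": "2023-01", "units_sold": 10, "revenue": 250.0, "profit": 40.0},
    {"month": "2023-02", "units_sold": 7, "revenue": 175.0, "profit": 28.0},
]


def pivot_to_rows(records):
    """Turn each measure column into its own row tagged with a measure type."""
    return [
        {"month": rec["month"], "measure_type": name, "amount": value}
        for rec in records
        for name, value in rec.items()
        if name != "month"
    ]


def pivot_to_columns(rows):
    """Reverse the pivot: collect the rows for each month into one record."""
    pivoted = {}
    for row in rows:
        record = pivoted.setdefault(row["month"], {"month": row["month"]})
        record[row["measure_type"]] = row["amount"]
    return list(pivoted.values())


row_wise = pivot_to_rows(column_wise)
round_trip = pivot_to_columns(row_wise)
```

The row-wise form suits reports that filter or group by measure type, while the column-wise form suits reports that compare measures side by side.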

300

What can we use a sliced fact for?

Most commonly, a slice focuses on regional or departmental data. Slices also limit the scope of the data set without sacrificing detail, which makes them easier to deploy across locations.

300

What are some examples of set operations?

Order facts that only report orders when they take place.

A coverage fact table that records the assignment of customers to salespeople.

400

What are the 2 major costs of a Derived Schema?

More costly ETL processing.

Multiple Star schemas to choose from.

400

How can you support BI tools that cannot drill across with a Merged Fact table?

By design, merged fact tables allow you to access data from multiple facts without the need to drill across.

400

What is an advantage and a drawback of pivoting a fact table?

Advantage: There is no need for complicated SQL queries to present the measures in an orientation that differs from the original star schema.

Drawback: Maintaining a pivoted schema can be complicated and difficult for developers or analysts not familiar with the schema or how pivoted facts work.

400

Give an example of a sliced star schema.

We have a star schema that tracks orders by day, product, salesperson, and region. One of the attributes of the salesperson dimension table is region. This attribute takes on values such as “East,” “West,” and so forth. Order_facts contains orders for all regions. A slice for the East region has the same structure but contains only the rows whose salesperson belongs to East.
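The regional slice can be sketched as a simple filter. This is only an illustration: the salesperson rows, the keys, and the dollar amounts are hypothetical, and a real slice would be built by the ETL process rather than in application code.

```python
# Hypothetical salesperson dimension keyed by surrogate key.
salesperson = {
    1: {"name": "Ann", "region": "East"},
    2: {"name": "Bob", "region": "West"},
    3: {"name": "Cy", "region": "East"},
}

# Hypothetical order_facts rows covering all regions.
order_facts = [
    {"day": "2023-01-05", "salesperson_key": 1, "order_dollars": 100.0},
    {"day": "2023-01-05", "salesperson_key": 2, "order_dollars": 75.0},
    {"day": "2023-01-06", "salesperson_key": 3, "order_dollars": 50.0},
]


def slice_by_region(facts, dimension, region):
    """Keep only the fact rows whose salesperson belongs to the given region."""
    return [
        row for row in facts
        if dimension[row["salesperson_key"]]["region"] == region
    ]


east_order_facts = slice_by_region(order_facts, salesperson, "East")
```

The slice keeps every column of the original rows, so queries written against order_facts run unchanged against east_order_facts.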

400

Why precompute set operations?

Precomputing the results of set operations is most often useful with intersect and minus operations and almost always involves a coverage table. This can help simplify queries later on in the process, offload complex queries to 'lower' tables that may not need to be processed as often, and improve performance.
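As a rough sketch, the minus and intersect operations against a coverage table might be precomputed like this. The assignment coverage table, the order rows, and all keys are hypothetical.

```python
# Coverage table: which customers are assigned to which salespeople.
# It has no measures; each pair simply records that the assignment exists.
assignment_facts = {("c1", "s1"), ("c2", "s1"), ("c3", "s2")}

# Hypothetical order facts; a row exists only when an order took place.
order_facts = [
    {"customer": "c1", "salesperson": "s1", "order_dollars": 100.0},
    {"customer": "c3", "salesperson": "s2", "order_dollars": 40.0},
    {"customer": "c3", "salesperson": "s2", "order_dollars": 60.0},
]

# Reduce the order facts to the distinct pairs that actually ordered.
ordering_pairs = {(row["customer"], row["salesperson"]) for row in order_facts}

# MINUS: assignments that produced no orders.
assigned_without_orders = assignment_facts - ordering_pairs

# INTERSECT: assignments that produced at least one order.
assigned_with_orders = assignment_facts & ordering_pairs
```

Storing these two results as their own tables means a report on inactive assignments reads a small precomputed table instead of repeating the comparison on every query.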

500

How would a Derived Schema affect Reports?

A derived schema will make it easier to write complex reports and lower report query times since you are able to design a Derived Schema to ask/solve specific questions.

500

What would a merged fact table of the following contain?

https://imgur.com/a/x8nwT7P

https://imgur.com/a/BmEUTwE

500

What would a Pivoted fact table look like for the following star?

https://imgur.com/a/4ovEvOB

https://imgur.com/a/Lp80y4d

500

How would you combine slices to derive a whole fact?

Union all the slices, taking care that the architecture stays consistent across the ETL and that the slices do not overlap.
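One way to sketch the union, with a guard against overlapping slices. The slice contents are hypothetical, and the grain of (day, salesperson_key) is an assumption made for this example.

```python
# Hypothetical regional slices that share one structure.
east_slice = [
    {"day": "2023-01-05", "salesperson_key": 1, "order_dollars": 100.0},
    {"day": "2023-01-06", "salesperson_key": 3, "order_dollars": 50.0},
]
west_slice = [
    {"day": "2023-01-05", "salesperson_key": 2, "order_dollars": 75.0},
]


def union_slices(slices):
    """Union all slices, rejecting any row that appears in more than one."""
    seen = set()
    whole = []
    for name, rows in slices.items():
        for row in rows:
            # Assumed grain for this sketch: one row per day and salesperson.
            key = (row["day"], row["salesperson_key"])
            if key in seen:
                raise ValueError(f"slice {name!r} overlaps an earlier slice at {key}")
            seen.add(key)
            whole.append(row)
    return whole


order_facts = union_slices({"East": east_slice, "West": west_slice})
```

Raising on a duplicate key is a design choice for the sketch; an ETL process might instead log the conflict and keep one copy.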

500

What is an example of a use case for a Precomputed Set Operation?

We have a fact called fact_sales that contains data for a chain of retail stores about each sale, including the store location, the product sold, the date of the sale, and the revenue generated.

We want to answer the question "What is the total revenue generated by all stores in the North region for Q1 of last year?"

We could use a precomputed set operation fact table called "quarterly_sales_by_region" that contains pre-aggregated data for each region and quarter. This makes access to the data faster than querying each star with costly joins and aggregations, and the pre-aggregated table is much smaller than the detail-level fact table.