Fact Table with Primary Key and Foreign Keys

Question

I have been going through Kimball-The-Data warehouse-Tookit 3rd edition to understand about facts and dimension; I couldn't understand the following paragraph information. Would anyone please give me possible understanding:

The fact table generally has its own primary key composed of a subset of the foreign keys. This key is often called a composite key. Every table that has a composite key is a fact table. Fact tables express many-to-many relationships. All others are dimension tables. There are usually a handful of dimensions that together uniquely identify each fact table row. After this subset of the overall dimension list has been identified, the rest of the dimensions take on a single value in the context of the fact table row’s primary key. In other words, they go along for the ride.

Thank you for giving your valuable time.

Jayvee · Accepted Answer

To completely understand this we may need to break it down and help us with a couple of pictures.

Let's start by the first definition:

The fact table generally has its own primary key composed of a subset of the foreign keys. This key is often called a composite key.

In this very simple diagram we can see that there is a fact table (Fact_Sales) which hasn't got its own primary key per se but a set of foreign keys pointing to the dimension tables (which do have primary keys). This set of foreign keys in sales (Date_Id/Store_Id/Product_Id) is a composite key, acting as the primary key of the fact table. Which bring us to the definition:

Every table that has a composite key is a fact table.

For the next definition:

Fact tables express many-to-many relationships

Again, as it can be seen in the ERD of the example, the fact table Fact_Sales can also be seen as a junction table expressing a many-to-many relationship between Dim_Store and Dim_Product, i.e a store can sell many products and a product can be sold in many stores.

Now, if a table is not a Fact table as defined above then we come to the next definition:

All others are dimension tables

Ok, so far so good so let's carry on with the subsequent definitions:

There are usually a handful of dimensions that together uniquely identify each fact table row.

As we make our model more complex we may reach the point where the set of foreign keys of a fact becomes a unique key. For example if we'd added another dimension Dim_Customer and therefore a new foreign key to Fact_Sales we'll then have the composite key as (Date_Id/Store_Id/Product_Id/Customer_Id), also lets say we add time to the dimension date (this could bring a separate discussion but for the sake of this analysis let's assume we have it in date dimension) then this composite key would be unique, since is not possible to sell the same thing to the same customer in the same store at the same time.

Finally, is time to extend our model a bit further so we are moving to a snowflake schema like this:

This last sentence:

After this subset of the overall dimension list has been identified, the rest of the dimensions take on a single value in the context of the fact table row’s primary key. In other words, they go along for the ride.

means that once we identify the dimensions corresponding to a fact through its foreign keys (fact table row's primary key), for example in the above diagram these would be Date_Id/Store_Id/Product_Id, the other dimensions get uniquely identified by the dimension tables foreign keys; for instance, Dim_Geography is "going along for the ride", invited by Dim_Store through its foreign key Geography_Id

Image Attribution: SqlPac at English Wikipedia, CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0, via Wikimedia Commons

Fact Table with Primary Key and Foreign Keys

Answers (1)

Related Questions