Reputation: 8278
I am reading about DW modeling and started wondering why surrogate keys are used at all?
I understand that sometimes business keys are not integers nthat makes the life (as well as joiing and indexing) harder. However, what I do not understand is why to solve kinda limitation of the DW or RDBMS by adding and extra column for managing unique identifiers?
Would it not be more appropriate that this kind of functionality would be transparent to DW/RDBMS users and the entry will get internal identifier from the system automatically? For example creating an SHA-1 digest of the entire row or a subset of it (those fields that can be represented in some kind of a textual format).
Upvotes: 0
Views: 1231
Reputation: 12824
The reason to use surrogate keys is because you have control over the data warehouse but most likely do not have control over the source systems. Assumptions you make today about the stability of the natural keys can cause you problems in the future.
Issues you may run into by not using your own surrogate key:
Employee
table with an auto incrementing integer used as the key. Each company will have an Employee #1. The DW warehouse will need a surrogate key to distinguish the two people who share the same ID.Upvotes: 2