Reputation: 334
A question in regards to Dimensional Modelling and Role Playing. We have an Address dimension which is ‘role playing’. We receive Addresses from different sources including CRM systems. Addresses could also be of different types, such as Address of a company, individual etc. So from the Role Playing Address dimension, a single address could be tagged as the Address of a company and Address for billing in different facts.
There are different fact tables and they have different keys which would hold address data. Fact_Sales would have keys such as Customer_Address_Key, Company_Head_Office_Address_Key. So I believe we are kind of role playing the addresses in these facts.
Question:
Our lead Data Architect has a concern around this. • We are capturing a lot of addresses from a number of systems. How would we identify where these addresses came from, and what type of addresses are these without going to the fact tables.
I would still suggest going through the facts, but I would like to consult the wider community over there before putting my feet firmly on the ground.
Is there any better way to do this, perhaps a separate table which defines the combination of Address_Key, Address_Type_Key and Source_Key.
Please let me know if you need any further clarification or pictures etc.
Cheers Nithin
Upvotes: 0
Views: 1984
Reputation: 4439
It is very reasonable to include source as an attribute of the dimension. The bigger question is how do you select the "Current" address for a customer if you have multiple sources. That is where things will get tricky.
You need Current Customer Address to mean the same thing throughout your business regardless of the source from which it was captured. I would refer to this as a conformed dimension. You need to 'conform' all of your addresses sources to the same structure so you can use them as a single dimension.
In the large majority of your facts, the source of the address is irrelevant. You are only needing to know that it is the current address. You may have a smaller model that can provide analysis on the source of the customer address.
The hard part is deciding which source is most trustworthy when the address is in multiple sources. You need to consider the source and the date of the last update. In other words, is the primary source still preferred when a less trustworthy source has a more recent update.
Type is usually just an attribute of the address. However, if your address can be used for multiple things (physical, shipping, billing, etc), that may need to be defined by the role-playing relationship. For other analytics on address, you can break city/state & zip into separate dimensions if you need to break things down by geographic location. I would recommend City & State be used as a single entity. If you treat City as separate from State, you'll get funny results when slicing by cities that exist in more than one state.
Upvotes: -1
Reputation: 2279
It sounds like in the situation you have that you should just include columns for the type of address and the source of the address in the address dimension itself, so it stands alone and you don't have to go via a fact to know what kind of thing it is. You wouldn't need a separate table with keys as you mentioned- the data can safely be denormalised in the dimension.
As an aside:
Although many people do have an address table which is separate, the approach from the Kimball Group would not be to have have 'address' or location dimension as a multi purpose dimension that stands alone- it provides part of what describes something else (like a company, or a customer, or even a 'delivery location'). Instead you'd have the dimension (e.g Customer) and Within that dimension you'd have a number of Address fields, named appropriately (CustomerAddress1, CustomerAddress2, CustomerCity). You may choose to administer the address centrally for convenience behind the scenes, with the other dimensions formed by means of views or further ETL, but in the presentation of the star schema the address table would not be seen separately. The addresses are still conformed in that they're called the same thing and mean the same thing.
However plenty of people go with a separate Address table as you've done
Upvotes: 2