Reputation: 785
I've just started learning Data Modeling in DW. And now I'm a bit confused about choosing with field to build a Dimension Table and whether should I split into many DIM tables For example, I have a Employee Table and it's columns as below:
EmployeeID
FirstName
LastName
State
Age
Department
Position
Skin Color
Hair Color
Sale Amount (Fact)
Emp_DIM table content every column above except Sale Amount
and a Fact table that contains
EmployeeID (Surrogate Key)
Sale Amount (Fact)
Ok, put the Date DIM aside. I want to ask about the Emp_DIM table
According to what I read on the internet, the DIM table should have as many fields as possible, and contain descriptive information. But I'm still confused and not sure.
Upvotes: 1
Views: 829
Reputation: 79
The employee dimension and sales fact table that you showed here are intuitive and logical; that's a good start. On the surface I don't see any reason to split those up, as you're just going to create more surrogate keys to manage.
I would only split your dimension if there's a need for a fact to reference those entities separately. Here's an example of why this might be necessary.
For example: Jane Smith (an employee) sold product X at location B because she was covering for someone else that day, but she is normally assigned to location A. Ergo, her employee dimension record would show that she is normally associated with location A, but the fact record for that particular sale would show it as having occurred at location B.
That's a good example of why you'd have a separate location dimension ... but you'll note that I'm still including the employee's assigned location in the employee dimension.
But if there's not a good reason compelling you to split your dimension, then I wouldn't. It's just going to make your schema more complex, your ETL's (or ELT's) more complex, etc.
Upvotes: 2
Reputation: 89361
Should I add Department_DIM
If a Department has other attributes you need to report on, then you should consider splitting it out. This "normalization of dimensions" is also called "snowflaking", or moving from a star schema to a snowflake schema. But it's totally optional and people do it both ways.
Then the Fact_Table will also have DepartmentID
This a separate decision. This is denormalizing by putting non-key dimension attributes directly on your fact table, and it's a performance optimization. You might do this for a few key attributes.
Upvotes: 0