Reputation: 97
Two part architecture question:
I have employee, job title, and supervisor dimensions. I kind of wanted to keep them in one dimension and have something like site > supervisor > job title > employee. The problem is that these need to be SCD. That is, they have historical associations to relate to the facts. The fact tables have a requirement to be processed every five minutes (dashboard).
1) Should I have these in a single dimension with a surrogate key (or composite for that matter)? The keys/surrogate key would be composed of calendar_id - employee_id.
2) Have the fact tables have maintain a reference to three different dimensions instead?
The requirement to process every 5 minutes (MOLAP SSIS ETL driven processing). Makes me lean toward keeping the time/change in the facts so that I would ease having to process the dimensions along with the fact tables.
Upvotes: 0
Views: 42
Reputation: 1405
I would design it as a single dimension, with the hierarchy you mentioned: site > supervisor > job title > employee.
Let's call this dimension EmployeeAssignment, because its granularity is not Employees, but any combination of site/supervisor/job title that an employee "adopts" during his/her career. (Feel free to come up with a better name).
I don't think you need a calendar_id key in this dimension: a surrogate key based on DISTINCT SiteID,SupervisorID,JobTitleID,EmployeeID would be enough. Adding a calendar_id key would be making the dimension do too much work: over and above slicing the actual facts, this would make the dimension answer questions like
"Where was employeeID 12345 (in the site/supervisor/job title network) on 1 January 2015?" and "How many employees did supervisorID 98765 supervise on 1st January 2015?"
These questions IMHO are best addressed with a fact, not a dimension. One cube I've worked on addresses with with an EmployeeDay measure: sliced by dimensions "EmployeeAssignment" and Time, this simply has a 1 if the employee is in that "assignment" on that day.
This EmployeeAssignment SCD is actually pretty slowly-changing, especially compared to your 5-minute fact update interval. Employees are not going to move about or get promoted every 5 minutes, so a reprocess of the dimension shouldn't be necessary more often than daily.
If I've misunderstood anything, let me know in the comments.
Upvotes: 0