Reputation: 979
I have not found an example or a way of building a dimension that contains schedule attributes. For example, in my scenario I'm building a data warehouse that will help to gather analytics on podcast/radio show episodes.
We have the following:
And I'm trying to add another dimension that contains schedule attributes about the podcast_show, for example, some shows air their episodes every day, others tuesdays and thursdays, others only saturdays.
dim_show_schedule (Option 1)
| schedule_key | show_key | time | sunday_flag | monday_flag | tuesday_flag | wednesday_flag | thursday_flag | friday_flag | saturday_flag |
|--------------|----------|-------|-------------|-------------|--------------|----------------|---------------|-------------|---------------|
| 1 | 0 | 00:30 | 0 | 0 | 1 | 0 | 1 | 0 | 0 |
| 2 | 1 | 12:30 | 0 | 1 | 1 | 1 | 1 | 1 | 0 |
| 3 | 2 | 21:00 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
However, would it be better to have a bridge table with something like:
bridge_show_schedule (Option 2)
| show_key | day_key |
|----------|---------|
| 0 | 2 |
| 0 | 4 |
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
| 1 | 5 |
dim_show_schedule (Option 3) (suggested by @nsousa)
| schedule_key | show_key | time | day |
|--------------|----------|-------|-------------|
| 1 | 0 | 00:30 | tuesday |
| 1 | 0 | 00:30 | thursday |
| 2 | 1 | 12:30 | monday |
| 2 | 1 | 12:30 | tuesday |
| 2 | 1 | 12:30 | wednesday |
| 2 | 1 | 12:30 | thursday |
| 2 | 1 | 12:30 | friday |
| 3 | 2 | 21:00 | saturday |
I've searched in Kimball's Data warehouse lifecycle toolkit and could not find an example on this use case.
Any thoughts?
Upvotes: 0
Views: 161
Reputation: 4439
Option 1
You can create a scheduled dimension that has a unique record for every possible schedule (128 daily combinations) combined with every reasonable start time. Using 5 minute intervals would still be less than 37k rows which is trivial for a dimension.
Option 2
If you want to leverage a date dimension instead, create a "Scheduled" fact that relate the show dimension to the date dimension for that future date. This would be handled in your ETL process to map the relationship. Your date dimension should already have the week and day of week logic included. You could also leverage your Show duration attribute to create a semi-additive calculated measure to allow you to easily get the total programming for the period.
I would opt for Option 2 as it provides many more possibilities for analytics.
Upvotes: 0
Reputation: 4544
If you keep a dimension with a string attribute saying which days it’s on, e.g., “M,W,F”, the most entries you have are 2^7, 128. A bridge table is an unnecessary complication.
Upvotes: 1