Reputation: 8672
It appears to be a common practice to let the time dimension of OLAP cubes be in a table of its own, like the other dimensions.
My question is: why?
I simply don't see what the advantage would be to have a time_dimension
table of (int, timestamp)
that is joined with the cube on some time_id
foreign key, instead of having a timestamp
column in the cube itself.
Principally, points in time are immutable and constant, and they are their own value. I don't find it very likely that one would want to change the associated value for a given time_id
.
In addition, the timestamp
column type is 4 bytes wide (in MySQL), as is the int
type that would otherwise typically be the key, so cannot be to save space either.
In discussing this with my colleagues, the only somewhat sensible argument I have been able to come up with is conformity with the other dimensions. But I find this argument rather weak.
Upvotes: 1
Views: 1069
Reputation: 148
Paddy is correct, the time dimension contains useful "aliases" for the time primitives. You can capture useful information about the dates themselves such as quarter, national holiday, etc. You can write much quicker queries this way because there's no need to code every holiday in your query.
Upvotes: 1
Reputation: 33857
I believe that it's often because the time dimension table contains a number of columns such as week/month/year/quarter, which allows for faster queries to get all of X for a particular quarter.
Given that the majority of OLAP cubes are written to get queries over time, this makes sense to me.
Upvotes: 3