Reputation: 86779
I am building a SSAS cube that includes several time dimensions that I want users to be able to use in their analysis. For example suppose I have the following dates (imagine some sort of work item tracking system)
Users might want to do the following
In short, they want to be able to see date / time comparisons between most (if not all) of these dates.
How can I provide this in a system with many dates? (preferably without adding a bazillion "difference between X and Y" fields)
Upvotes: 2
Views: 1757
Reputation: 15037
I would create a new fact, based on the existing fact, with a row per each existing fact row, per each meaningful date combination. Eg 1 row for 'Created Date to Closed Date', another row for 'Created Date to Last Edit Date'. I would limit this to the meaningful combinations, e.g. no rows for 'Closed Date to Created Date'. There may be additional useful combinations comparing a date to 'Today'.
I would build this using a View, with a series of SELECT and UNION ALL clauses - one per meaningful date combination. The design of each SELECT clause would be to return all rows and columns from the existing fact data, plus a character column to describe the meaningful date combinations, plus a SQL calculation of the datediff in days and another in hours.
Then I would add this new View to the cube, "copying" (I wish) the dimension relationships from the current fact. I would create Avg, Min and Max variants of the datediff in days and also hours (6 measures).
Finally I would build a fact (degenerate/junk) dimension based on the column to describe the meaningful date combinations.
Upvotes: 0
Reputation: 13315
Keep it
A simple approach could be to just implement these measures, as for four date types there would only be six different differences, and applying three different aggregations, you would end up with 18 measures, which would be reasonable. If you use the "Display Folder" property of the measures (and your client tool uses these), I would think users would be able to easily find their way around with even forty or fifty measures. And there will for suer be some combinations that do not make sense from a business point of view, even if you could technically implement them.
Utility Attributes
Another way would be to look at this from a dimensional point of view, as you are using a multidimensional cube anyway: Create a utility dimension with three attributes:
This dimension can be completely unrelated to any measure group. The table on which it is based would contain all (valid) combinations of these attributes.
You would only use a single measure "time diff", which would probably be a calculated measure.
Possibly rename the attributes and members as well as the measure in a way that is easy to understand for end users.
Then you would implement in the calculation script the calculation of measure "time diff" in a way that checks the CurrentMember
of all three attributes, and then does the appropriate calculation.
This could look similar to:
CASE
WHEN IsError([Utility].[From Date].CurrentMember) // multi selection
OR [Utility].[From Date].CurrentMember IS [Utility].[From Date].[All]
OR IsError([Utility].[To Date].CurrentMember)
OR [Utility].[To Date].CurrentMember IS [Utility].[To Date].[All]
OR IsError([Utility].[Aggregation].CurrentMember)
OR [Utility].[Aggregation].CurrentMember IS [Utility].[Aggregation].[All] THEN
NULL // these special cases are invalid
ELSE
CASE
WHEN [Utility].[Aggregation].CurrentMember IS [Utility].[Aggregation].[Minimum] THEN
Min(Leaves(),
CASE
WHEN [Utility].[To Date].CurrentMember IS [Utility].[To Date].[Started] THEN
Measures.StartDate
WHEN [Utility].[To Date].CurrentMember IS [Utility].[To Date].[Closed]
Measures.ClosedDate
...
END
-
CASE
WHEN [Utility].[From Date].CurrentMember IS [Utility].[From Date].[Started] THEN
Measures.StartDate
WHEN [Utility].[From Date].CurrentMember IS [Utility].[From Date].[Closed]
Measures.ClosedDate
...
END
)
WHEN [Utility].[Aggregation].CurrentMember IS [Utility].[Aggregation].[Minimum] THEN
Max(Leaves(),
... // repeat code from above
)
WHEN [Utility].[Aggregation].CurrentMember IS [Utility].[Aggregation].[Average] THEN
Avg(Leaves(),
... // repeat code from above
)
END
END
This code might need some performance tuning, depending on the size of the cube, but you should get the idea: just check all cases of the attribute and implement the appropriate calculation.
The disadvantages of this second approach would be that users would need some education, as for the "time diff" measure to work, they will have to add three more attributes to every query, and apply selections as appropriate. Depending on the client tool, it is not possible so have three attributes with individual selections e. g. in two columns like
Attrib | Col 1 | Col 2
From | Created | Started
To | Started | Closed
Agg | Min | Max
but only cross products of selected members.
And you have to take care/give understandable messages to users for cases like multi-selections or no selection at all (i. e. CurrentMember
is the All
member) for the utility attributes.
After all the disadvantages, here is a description how the utility dimension approach can be applied to a different subject, in this case, time calculations like Year over Year growth and Year to Date: http://sqlblog.com/blogs/marco_russo/archive/2007/09/02/datetool-dimension-an-alternative-time-intelligence-implementation.aspx
Mixed Approach
Finally, you could use a mixed approach, where you e. g. create measures for all combinations of date types, but a utility attribute for the aggregations.
Upvotes: 1