Reputation: 31
I have to build a cube based on date range records, and not sure about the best way to proceed.
Imagine say a cube of Cars and warranty periods. Each car has a start date, and an end of warranty periods. Then there may be extended warranty periods.. so imagine
CAR REG TYPE WARRANTY START WARRANTY END
CAR A PURCHASE 01/01/2016 31/01/2016
CAR A EXTENDED 01/01/2017 30/06/2017
CAR A EXTENDED 01/08/2017 30/01/2018 -- note, gap here
CAR B PURCHASE 01/01/2016 31/01/2016
CAR B EXTENDED 01/01/2017 30/06/2017
CAR B EXTENDED 01/08/2017 30/01/2018 -- note, gap here
So multiple items, with multiple date ranges. There is a main table (CARS) with car details (colour, model, etc).
Now I want to build a cube, which is reportable at month level, cars under warranty/warranty type, etc.
So plan 1 was to build a view which explodes the above out by a join to a date table, report by month, and feed this into a cube. But, the number of cars multiplied by the months covered leads to multi hundreds of milions of rows - which means sometimes the server runs out of TempDB space, and when it does run, the cube takes hours to build.
Is there a better way - such as a view for the car details, and then another view on the warranty table (how do I get SSAS to deal with months in a date range) - will the join in SSAS be more efficient than a join in a view in SQL?
Thanks all.
Upvotes: 0
Views: 2028
Reputation: 11625
One approach which will work with drag-and-drop client tools like Excel or Power BI would be a many-to-many Date dimension. Since car A and B match, let's pretend there's a car C which has a warranty from 2015-07-30 to 2015-12-31.
Create a DimWarrantyDateRangeKey which represents a unique combination of dates during which a warranty is active. The surrogate key is WarrantyDateRangeKey. Certainly the ETL that builds this table will be a bit expensive, but given the size of your data I think it's a worthwhile investment that will produce better query performance than if your m2m bridge table has one row per active day per car.
Each car should be assigned one WarrantyDateRangeKey. Add the WarrantyDateRangeKey column to your fact tables...
CAR REG WarrantyDateRangeKey
A 1
B 1
C 2
m2mWarrantyDateRange
WarrantyDateRangeKey DateKey
1 20160101
1 20160102
1 ...
1 20170629
1 20170630
1 20170801
1 20170802
1 ...
1 20180129
1 20180130
2 20150701
2 20150702
2 ...
2 20151230
2 20151231
The tables relate together as follows...
FactTable -> DimWarrantyDateRange <- m2mWarrantyDateRange -> DimDate
Then in the cube you DimWarrantyDateRange should be a dimension, m2mWarrantyDateRange should be a measure group with a count measure. DimDate should be a dimension. Then you should relate DimDate to FactTable as a many-to-many (m2m) dimension using m2mWarrantyDateRange as the intermediate measure group.
Now in Excel or Power BI you should be able to filter to a particular date and it will filter down to the cars which had an active warranty on that day.
Upvotes: 1
Reputation: 114
You can connect start and end columns to time dimension. And on the report you can use ":" operator to build date tange report. More details you will find here: http://www.purplefrogsystems.com/blog/2013/04/mdx-between-start-date-and-end-date/
Upvotes: 1