Processing partitions takes longer than processing entire database

Question

I have a Tabular Model cube where I have split the tables into partitions to make processing more efficient.

When I Process Full the daily partition only, it takes 2h 45m. However, when I Process Full the entire database (that includes daily and historical data), it takes 1h 10m.

Does anyone know what could be causing this?

Thanks!

Denny Lee · Accepted Answer

ProcessFull in a Tabular model is basically a combination of ProcessData (grab the data from the source, build dictionaries, etc.) and ProcessRecalc (build indexes, attribute hierarchies, etc.). While ProcessData only grabs the most recent data (i.e. the data for the partition), ProcessRecalc has to be executed against the entire database. A good reference is Cathy Dumas' blog post: http://cathydumas.com/2012/01/25/processing-data-transactionally-in-amo/
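To make the two phases visible, you could run them as separate steps and time each one. Here is a minimal sketch in Python that only builds a TMSL script doing that: a `dataOnly` refresh of one partition followed by a `calculate` refresh of the whole database. It assumes a model at compatibility level 1200 or higher, where TMSL is available (on older models the equivalent would be an XMLA batch). The database, table, and partition names are made up for illustration.

```python
import json


def build_split_refresh(database, table, partition):
    """Build a TMSL script that loads one partition, then recalculates the database.

    'dataOnly' roughly corresponds to the ProcessData phase and 'calculate'
    to the recalc phase described above.
    """
    load_partition = {
        "refresh": {
            "type": "dataOnly",
            "objects": [
                {"database": database, "table": table, "partition": partition}
            ],
        }
    }
    recalc_database = {
        "refresh": {
            "type": "calculate",
            "objects": [{"database": database}],
        }
    }
    # A TMSL 'sequence' runs its operations in order as one batch.
    return {"sequence": {"operations": [load_partition, recalc_database]}}


# Hypothetical object names; replace with the ones in your model.
script = build_split_refresh("SalesModel", "FactSales", "FactSales_Daily")
print(json.dumps(script, indent=2))
```

You can paste the printed JSON into an XMLA query window in SSMS. If you run the two refresh commands one at a time instead of in a sequence, the timings should tell you whether the data load or the recalc is the slow part.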

To find the cause, it's best to dig into the Profiler traces / logs and see which actions are taking a very long time during processing. By any chance, does your data contain a lot of repeated values, such as audit logs? It may be faster to process the entire database (vs. a single partition) because the engine can compress and organize the repeated data more efficiently, so it takes up less memory. One way to check this is to compare the model size after running ProcessFull on the partition vs. running it on the entire database. If this is the case, processing the entire database will result in a smaller database.
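For the size comparison, one rough option is to measure the database folder on disk after each run. This is only a sketch: it assumes you can read the instance's data directory, the folder path in the comment is hypothetical, and on-disk size is only an approximation of the in-memory size.

```python
import os


def folder_size_bytes(path):
    """Sum the sizes of all files under path, including subfolders."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def percent_change(before, after):
    """Relative size change from before to after, as a percentage."""
    return (after - before) / before * 100.0


# Hypothetical usage; the folder path depends on your instance:
#   size_partition = folder_size_bytes(r"D:\OLAP\Data\SalesModel.0.db")
#   ... run ProcessFull on the entire database ...
#   size_database = folder_size_bytes(r"D:\OLAP\Data\SalesModel.0.db")
#   print(percent_change(size_partition, size_database))
```

A clearly negative percentage after the database-level run would be consistent with the better-compression explanation.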

HTH!
