Dan
Dan

Reputation: 10680

Efficiently displaying Top N in Excel PivotTable from SSAS Tabular Model

I have a simple tabular model consisting of a fact table with approx. 20 mio. records (sales transactions) and a dimension table with approx 600.000 records (customers).

A typical reporting scenario is to get the top 10 customers over some measure in the fact table, possibly filtered by some other criteria (typically a time period, a product, etc.).

In Excel, aggregating all of the 20 mio. records to return a total sales amount is instant. However, once I try to group by customers, it takes some time (15-20 seconds) to retrieve all the data, which makes sense since there is quite a lot of customers (600.000) that needs to be displayed.

Now, if I apply a value filter in Excel, to get only the top 10 customers, it still takes around 15-20 seconds to return the result, which is unacceptable to my users (as they would like to instantly see the top 10 customers while slicing on other attributes such as product, time, etc).

Internally, Excel uses the TOPCOUNT MDX-function when querying the tabular model with a value filter.

Is there anything I can do in the tabular model, to speed up these kinds of queries?

I've tried:

Upvotes: 2

Views: 1403

Answers (1)

Dan
Dan

Reputation: 10680

After searching the web some more and performing additional tests, I have come to the conclusion that SSAS Tabular is inherently bad at answering TOPCOUNT/RANK-kind queries in an unfiltered context. The reason seems to be quite simple:

If I have a dimension containing 600.000 customers, and I ask for the top 10 by Sales Amount in an unfiltered context, the tabular engine needs to compute the sum of Sales Amount for each individual customer, before being able to sort and return the top 10 or the rank of each customer.

In multidimensional OLAP, data is typically preaggregated on a pr. customer level, meaning the multidimensional cube can answer these kinds of queries much faster.

When applying one or more filters in SSAS tabular (for example current month, a specific product, etc.), I saw a significant performance increase. The solution, in my case, is then to educate my users to always filter their data before including the customer dimension in their PivotTables.

Upvotes: 1

Related Questions