Reputation: 43
I have a report in which one of the columns contains many unique integers. I saw that the data format is decimal (double) , I decided to convert this column to int64 (integer) format. In dax studio, VertiPaq actually showed a decrease in the memory occupied by the column
, but when saving the report in the desktop version, the weight of the report itself as a file increased. How can this be?
Upvotes: 1
Views: 186
Reputation: 30289
Compression is based on a number of factors. The number of distinct values and the sort order of columns plays a big part. VertiPaq will also use some heuristics to calculate the optimum compression strategy for the data and this can change based on many different variables.
In your case, hash encoding has been used which requires a dictionary. This will also probably use run length encoding which means the sort order of the columns can make the memory footprint fluctuate.
Data compression in VertiPaq is quite involved. I can recommend Chapter 17 in the Definitive Guide to DAX if you want all the details. I quote the following excerpt.
The factors influencing the compression ratio of a Tabular model are, in order of importance:
- The cardinality of the column, which defines the number of bits used to store a value.
- The number of repetitions, that is, the distribution of data in a column. A column with many repeated values is compressed more than a column with very frequently changing values.
- The number of rows in the table.
- The data type of the column, which only affects the dictionary size. Given all these considerations, it is nearly impossible to predict the compression ratio of a table
Upvotes: 1