Reputation: 511
I have a query where I am grouping by a column with very large fields (most have thousands of chars) and seeing a performance hit. Grouping by other smaller columns does not have a proportional affect on performance.
My intuition is that the group by is based on hashing so size doesn't matter (I can't find docs on under the hood behavior)
Could there be something else at play here or does field size somehow affect performance of the group by?
Upvotes: 0
Views: 320
Reputation: 1269963
My intuition is that the group by is based on hashing so size doesn't matter
I'm a little confused by this reaction. Hashing requires processing the entire value of the key for two key operations:
I am not intimately familiar with the Postgres implementation of hash tables, but large key values also run the risk of eating up memory -- and that is going to slow down any algorithm.
I would expect hashing to have performance proportional to the key lengths.
Upvotes: 2
Reputation: 131
Have you tried creating a combined index on the fields that are being grouped?
Upvotes: 0