Niels Kristian
Niels Kristian

Reputation: 8845

Will gzipping + storage as bytea save more disk space over storage as text?

If I have a table containing 30 million rows and one of the columns in the table is currently a text column. The column is populated with random strings of a size between 2 and 10 kb. I don't need to search the strings directly.

I'm considering gzipping the strings before saving them (typically reducing them 2x in size) and instead save them in a bytea column.

I have read that Postgresql does some compression of text columns by default, so I'm wondering: Will there be any actual disk space reduction as a product of the suggested change?

I'm running Postgresql 9.3

Upvotes: 4

Views: 1086

Answers (1)

Laurenz Albe
Laurenz Albe

Reputation: 246788

PostgreSQL stores text columns that exceed about 2000 bytes in a TOAST table and compresses the data.

The compression is fast, but not very good, so you can have some savings if you use a different compression method. Since the stored values are not very large, the savings will probably be small.

If you want to go that way, you should disable compression on that already compressed column:

ALTER TABLE tab
   ALTER bin_col SET STORAGE EXTERNAL;

I'd recommend that you go with PostgreSQL's standard compression and keep things simple, but the best thing would be for you to run a test and see if you get a benefit from using custom compression.

Upvotes: 4

Related Questions