CJ Cobb
CJ Cobb

Reputation: 105

Does Postgres store bytea data as hex on the server?

To work with bytea values in PostgreSQL, I usually am serializing to and deserializing from hex. This seems to be the preferred way. However, what is actually stored on the PostgreSQL server? Is it hex, or the unhexed binary? The reason I care is that hex is obviously going to take up double the space as unhexed binary. When I say unhexed binary, I mean the hex string "00", which is 2 bytes, is just "0", which is 1 byte, as unhexed binary.

The context is I have a Postgres database and a Scylla database that are storing the exact same data in almost the exact same format. However, the total space used by Postgres is almost exactly double the space used by Scylla. For Scylla, I don't encode binary as hex. I just send raw binary over the wire. I don't expect these two databases to use the exact same amount of space. But for PostgreSQL to use double the space is quite a lot of overhead, and the nearly exact doubling really makes me suspect data is being stored as hex and not actual binary on the server (since hex uses exactly double the space as actual binary).

Upvotes: 2

Views: 537

Answers (1)

Laurenz Albe
Laurenz Albe

Reputation: 248215

A bytea is stored in binary form, not hex encoded, which would be enormously wasteful. The hex representation is just the default text representation generated by the type output function.

I don't know Scylla, so I cannot explain the difference, but PostgreSQL has substantial overhead per row (23 bytes), and there is also some overhead per 8kB block.

You say in your comments that you measured the database size, which includes all the metadata and indexes. I suggest that you use pg_table_size to measure the table.

Note that PostgreSQL automatically compresses values if a table row would otherwise exceed 2000 bytes.

Upvotes: 2

Related Questions