bphi

Reputation: 1107

Does zlib's "uncompress" preserve the data's original endianness, or does it do an endian conversion?

I am working with legacy C++ code that accesses two-byte integer data compressed in a sqlite database. The code uses zlib's uncompress function to extract the data, which comes out on my little-endian machine as little-endian values.

To allow for the possibility that this code may be ported to big-endian machines, I need to know if the data will always decompress in little-endian order, or if (instead) zlib will somehow do the conversion.

This is the only applicable tidbit I've been able to find (from the FAQ on zlib's site):

  26. Will zlib work on a big-endian or little-endian architecture, and can I exchange compressed data between them? Yes and yes.

Doesn't really answer my question... I'm prepared to handle the endian conversion if needed. Is it safe to assume that the original input data endianness is what you get back out, regardless of the platform on which you run uncompress? (I don't have access to a big-endian machine at present on which to test this myself).

Upvotes: 0

Views: 953

Answers (2)

Jongware

Reputation: 22478

RFC 1950 specifies how multi-byte values in zlib's own metadata are stored:

Within a computer, a number may occupy multiple bytes. All multi-byte numbers in the format described here are stored with the MOST-significant byte first (at the lower memory address). For example, the decimal number 520 is stored as:

         0     1
     +--------+--------+
     |00000010|00001000|
     +--------+--------+
      ^        ^
      |        |
      |        + less significant byte = 8
      + more significant byte = 2 x 256

So zlib's handling of its own internal multi-byte values has to take endianness into account, which is what FAQ #26 addresses.

The data you compressed comes back unchanged, because zlib compresses and decompresses with a granularity of bytes, not larger units.
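As a rough illustration of the byte order described in the RFC excerpt, here is a minimal sketch that splits a 16-bit value into its bytes, most significant first. The helper name `to_big_endian_bytes` is made up for this example and is not part of zlib.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper (not a zlib function): returns the two bytes of a
// 16-bit value with the most significant byte first, as in the RFC example.
std::array<std::uint8_t, 2> to_big_endian_bytes(std::uint16_t value) {
    return {{ static_cast<std::uint8_t>(value >> 8),      // more significant byte
              static_cast<std::uint8_t>(value & 0xFF) }}; // less significant byte
}
```

For the RFC's example value 520, this gives the bytes 2 and 8, on any host, because shifts operate on the value rather than on its memory layout.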

Upvotes: 1

Mark Adler

Reputation: 112482

zlib compresses and decompresses a stream of bytes losslessly, so whatever endianness went in is exactly what comes out. This holds regardless of the endianness of the compressing and decompressing machines.

The FAQ entry refers to the fact that the code was written to be insensitive to the endianness of the architecture it is compiled for and run on.
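Since the bytes come out exactly as they went in, code meant to run on both kinds of machine can decode the stored two-byte values explicitly instead of casting the buffer. A minimal sketch, assuming the values were stored little-endian and are unsigned; `decode_le16` is a hypothetical helper, not a zlib function, and `bytes` stands for whatever buffer `uncompress` filled:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: interprets a byte buffer as little-endian 16-bit
// unsigned integers. The result is the same on any host, because the bytes
// are combined arithmetically rather than reinterpreted in place.
std::vector<std::uint16_t> decode_le16(const unsigned char* bytes,
                                       std::size_t byte_count) {
    std::vector<std::uint16_t> values;
    values.reserve(byte_count / 2);
    // A trailing odd byte, if any, is ignored.
    for (std::size_t i = 0; i + 1 < byte_count; i += 2) {
        values.push_back(static_cast<std::uint16_t>(
            bytes[i] | (bytes[i + 1] << 8)));  // low byte first, then high byte
    }
    return values;
}
```

If the stored values are signed, converting each element to `std::int16_t` afterwards is one option. On a little-endian host this produces the same numbers as the direct cast, so existing behaviour should not change.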

Upvotes: 2
