r_l

Reputation: 23

zlib compress() returns Z_BUF_ERROR despite buffer allocated to result of compressBound (file too big?)

When using zlib, my call to compress() returns Z_BUF_ERROR when I try to compress a 13 GB file, even though I believe the output buffer is allocated correctly. The same code works on smaller files.

struct stat infile_stat;
FILE *fp = NULL;

if ((fp = fopen(md_of_name, "rb")) == NULL) { /* binary mode: raw bytes are compressed */
  fprintf(stderr,
          "Error: Unable to open file %s.\n",
          md_of_name);
  exit(1);
}

if (stat(md_of_name, &infile_stat) != 0) {
  fprintf(stderr, "Error: Unable to stat file %s.\n", md_of_name);
  exit(1);
}
size_t u_len = infile_stat.st_size;

char *u_buf = (char *)malloc(u_len);

if (u_buf == NULL) {
  fprintf(stderr, "Error: Unable to malloc enough memory for the "
                   "uncompressed buffer\n");
  exit(1);
}

if (fread(u_buf, 1, u_len, fp) < u_len) {
  fprintf(stderr,
          "Error: Unable to read in all of file %s. Exiting.\n",
          md_of_name);
  exit(1);
}
fclose(fp);

/* compress() takes a uLongf * for the destination length, so use uLong here */
uLong c_len = compressBound(u_len);

Bytef *c_buf = (Bytef *)malloc(c_len);

if (c_buf == NULL) {
  fprintf(stderr, "Error: Unable to malloc enough memory for the "
                  "compressed BIM buffer\n");
  exit(1);
}

fprintf(stderr, "u_len:%zu\tc_len:%zu\tc_buf:%p\n", u_len, (size_t)c_len, (void *)c_buf);

int r = compress(c_buf, &c_len, (Bytef *)u_buf, u_len);

if (r == Z_MEM_ERROR)
  fprintf(stderr, "Not enough memory\n");
else if (r == Z_BUF_ERROR)
  fprintf(stderr, "Not enough room in the output buffer.\n");
assert(r == Z_OK);

When I run this on a file that is 13922075353 bytes, the output is:

u_len:13922075353   c_len:13926324460   c_buf:0x7f2b82436010
Not enough room in the output buffer.

Followed by the assert failure.

UPDATE

I believe this error is a result of a casting issue inside of the compress() function in zlib. If I am correct, the error is being returned on line 40 of compress.c in zlib 1.2.8 which is

if ((uLong)stream.avail_out != *destLen) return Z_BUF_ERROR;

That stream.avail_out variable is set just above it with:

stream.avail_out = (uInt)*destLen;

I believe that the cast is the issue. *destLen is an unsigned long, and when it is cast to a uInt the upper bits are dropped (stream.avail_in is set the same way, from (uInt)sourceLen). In my case sourceLen is 13922075353 and destLen is 13926324460 (from compressBound()), but because of the cast stream.avail_out is 1041422572. Hence the error.

If this is correct, then there is an implicit bound on the size of the buffers. What I now do not understand is why the buffer sizes are declared as unsigned long when they are effectively limited to what fits in an unsigned int.

Upvotes: 2

Views: 2678

Answers (2)

r_l

Reputation: 23

Now that I know what to look for, I see this issue is addressed in the zlib FAQ, which states that "compress() and uncompress() may be limited to 4GB, since they operate in a single call."

I still think that compress and uncompress should not take sizes as an unsigned long.

Upvotes: 0

Mark Adler

Reputation: 112374

For something that big, you need to use deflateInit(), deflate(), and deflateEnd().

Upvotes: 1
