sam

Reputation: 1401

Error `invalid distance too far back` when inflating gzip-compressed HTML content

I want to inflate (decompress) HTML web pages. I am using the zlib functions `inflateInit2(&zstream, 15+32);` and then `inflate(&zstream, Z_SYNC_FLUSH);`.

This works correctly for many web pages, but not for "www.tabnak.ir": for that site I get the error `invalid distance too far back`. That page is also gzip-compressed and UTF-8 encoded. How should I deal with this?

This is the start of the data for bing.com, which works fine:

1f 8b 08 00    ef 8c 77 56    00 ff ec 5a    eb 73 9c 46
12 ff 9e aa    fc 0f 04 d5    9d ad 78 1f    c0 3e b4 0b
96 52 b2 24    2b ba 73 1c    9d 2d 27 b9    8a af b6 06

This is the start of the data for tabnak.ir, which results in the `invalid distance too far back` error:

1f 8b 08 00    00 00 00 00    00 03 ed fd    db 73 5b d7
99 2f 8a 3e    ab ab d6 ff    30 ac ae ac    d8 3b 82 80
39 71 a7 6d    55 39 89 7b    75 f7 4a d2    7d 92 74 af 

Upvotes: 3

Views: 5826

Answers (2)

mksteve

Reputation: 13085

The zlib/gzip format compresses data with back-references such as "the next 22 bytes are the same as the 22 bytes we saw 1013 bytes ago".

In this case, a back-reference points further back than the 'window' of previously decompressed data allows.

Given that you have specified the maximum window size, the likely explanations are that the data format has changed slightly, or that the data you received is not the same as what was sent.
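To make the back-reference idea concrete, here is a minimal sketch in C. It is not zlib's code: `lz77_copy` and its return codes are hypothetical names made up for illustration. It applies one "copy `length` bytes from `distance` bytes ago" instruction and refuses it when the distance reaches back before the data produced so far, which is roughly the condition zlib reports as `invalid distance too far back`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of zlib): apply one LZ77-style
 * back-reference to the output buffer.
 *   out      - buffer holding the bytes decompressed so far
 *   out_len  - number of valid bytes in out (updated on success)
 *   out_cap  - total capacity of out
 * Returns  0 on success,
 *         -1 if the distance points before the start of the data
 *            (the "too far back" case),
 *         -2 if the copy would not fit in the buffer. */
int lz77_copy(unsigned char *out, size_t *out_len, size_t out_cap,
              size_t distance, size_t length)
{
    assert(out != NULL && out_len != NULL);

    if (distance == 0 || distance > *out_len)
        return -1;              /* nothing that far back to copy from */
    if (length > out_cap - *out_len)
        return -2;              /* not enough room for the copy */

    /* Copy byte by byte so that overlapping copies
     * (distance < length) repeat the pattern, as deflate expects. */
    for (size_t i = 0; i < length; i++) {
        out[*out_len] = out[*out_len - distance];
        (*out_len)++;
    }
    return 0;
}
```

With 3 bytes `abc` already in the buffer, a reference with distance 3 and length 6 extends it to `abcabcabc`, while a reference with distance 10 is rejected because only 9 bytes exist. If bytes are lost or altered in transit, a perfectly valid stream can end up containing such an impossible reference.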

Some things to check.

  1. You are using the latest zlib library.
  2. Standard utilities (e.g. gunzip, WinZip) can decompress the data.
  3. The data you are getting is not being mangled by a text-mode filter (e.g. a file opened with 'rt' instead of 'rb').

If none of that helps, try walking through the data to understand where gzip decoding fails.
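As a first step in walking through the data, one can decode the fixed 10-byte gzip header by hand. The sketch below follows the field layout from RFC 1952; the struct and the function name `parse_gzip_header` are hypothetical, chosen only for this example, and optional header fields (signalled by non-zero flags) are deliberately not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed part of a gzip member header (RFC 1952). */
struct gzip_header {
    uint8_t  method;  /* CM: 8 means deflate */
    uint8_t  flags;   /* FLG: non-zero means optional fields follow */
    uint32_t mtime;   /* MTIME: little-endian Unix time, 0 = not set */
    uint8_t  xfl;     /* XFL: extra flags */
    uint8_t  os;      /* OS: 3 = Unix, 255 = unknown */
};

/* Hypothetical helper: decode the first 10 bytes of a gzip stream.
 * Returns 0 on success, -1 if the input is too short or the
 * magic bytes 1f 8b are missing. */
int parse_gzip_header(const unsigned char *p, size_t len,
                      struct gzip_header *h)
{
    assert(p != NULL && h != NULL);

    if (len < 10 || p[0] != 0x1f || p[1] != 0x8b)
        return -1;

    h->method = p[2];
    h->flags  = p[3];
    h->mtime  = (uint32_t)p[4]
              | (uint32_t)p[5] << 8
              | (uint32_t)p[6] << 16
              | (uint32_t)p[7] << 24;
    h->xfl    = p[8];
    h->os     = p[9];
    return 0;
}
```

Applied to the two dumps in the question, both headers decode cleanly: each has method 8 (deflate) and flags 0, so the deflate data starts right at offset 10. The tabnak.ir header has no timestamp and OS 3 (Unix), while the bing.com one has a timestamp and OS 255 (unknown). The headers themselves therefore look valid, which suggests that the problem lies in the compressed data that follows or in how it is fed to `inflate`.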

Upvotes: 1

gnasher729

Reputation: 52612

It would seem that the file you are trying to "inflate" (decompress using zlib) is not a valid zip file. Since bing.com is most likely not a zlib file, it might be pure coincidence that you found something quite early that prevented decompression.

Upvotes: -2
