Mark S

Reputation: 21

zlib inflate() and loss of data over network link

I have two applications that communicate over a network link. The sender compresses each packet using zlib's deflate() and sends it over the network. The receiver then decompresses the data using inflate(). Packets may be lost on the network link, so to minimize decompression errors I have implemented the following approach:
Sender

  1. Calls deflate() with Z_SYNC_FLUSH most of the time, but intermittently calls deflate() with Z_FULL_FLUSH.

  2. Sends, along with the data, a 2-byte field that contains a sequence number and a bit indicating whether FULL_FLUSH or SYNC_FLUSH was used.
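The sender steps above can be sketched roughly as follows. This is a minimal illustration in Python, whose `zlib` module wraps the same library, not the asker's actual code. The header layout (top bit for the flush type, low 15 bits for the sequence number), the `Sender` class, the `full_flush_every` interval, and the use of raw deflate (`wbits=-15`) are all assumptions made for the example.

```python
import struct
import zlib

# Assumed header layout (not given in the question):
FULL_FLUSH_BIT = 0x8000  # top bit: 1 = packet ends with a full flush
SEQ_MASK = 0x7FFF        # low 15 bits: sequence number


class Sender:
    """Hypothetical sender: one compressed packet per payload."""

    def __init__(self, full_flush_every=4):
        # wbits=-15 selects raw deflate (no zlib header or trailer).
        self.comp = zlib.compressobj(wbits=-15)
        self.full_flush_every = full_flush_every
        self.seq = 0

    def make_packet(self, payload):
        # Use a full flush on every Nth packet, a sync flush otherwise.
        full = (self.seq + 1) % self.full_flush_every == 0
        mode = zlib.Z_FULL_FLUSH if full else zlib.Z_SYNC_FLUSH
        # flush() returns all pending output, so the packet body ends
        # exactly at the flush point.
        body = self.comp.compress(payload) + self.comp.flush(mode)
        header = (FULL_FLUSH_BIT if full else 0) | (self.seq & SEQ_MASK)
        self.seq = (self.seq + 1) & SEQ_MASK
        return struct.pack("!H", header) + body
```

With `full_flush_every=4`, the packets with sequence numbers 3, 7, 11, and so on carry the full-flush bit.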

Receiver

  1. Reads in the data and, using the sequence number, detects whether a packet has been lost. When NO packet has been lost, the 2 bytes are removed and the decompression works properly.
  2. When a lost packet is detected, the receiver checks whether the current packet is a FULL_FLUSH or a SYNC_FLUSH packet.

    • If it's a SYNC_FLUSH, then the packet is simply dropped and we proceed with the next packet.

    • If it's a FULL_FLUSH, however, the receiver removes the extra 2 bytes and calls inflate().

This works 99% of the time, in the sense that inflate() succeeds and the uncompressed data is indeed the same as what the sender had before compression. This was my expectation!
Once in a while, however, this approach puts the receiver in a bad state where every subsequent packet (FULL_FLUSH packets included) fails to decompress. inflate() returns Z_DATA_ERROR and zlibContext.zstream.msg contains 'incorrect header check', although I have occasionally received an 'invalid distance too far back' message.

My first question is

Should I expect inflate() to recover and succeed when the packet at hand was compressed using the FULL_FLUSH flush mode, even if previous packets were lost? For example, the sender compresses the first 3 packets using deflate(ctx, SYNC_FLUSH) and sends them, one at a time, over the network. The sender then compresses the fourth packet using deflate(ctx, FULL_FLUSH) and sends it across the network. The receiver receives packets 1 and 2 and calls inflate() with success. The receiver then receives packet 4 and detects (via the sequence number) that it has missed packet 3. Since packet 4 was compressed using a FULL_FLUSH, the receiver expects that inflate() will successfully decompress the packet. Is this a valid assumption?

My second question is

Is there anything else I need to do in the receiver to be able to recover from packet loss and continue decompressing packets?

Upvotes: 2

Views: 587

Answers (2)

Mark Adler

Reputation: 112284

If you properly break the compressed stream after the full flush, then yes, what follows will always be decompressible with a new or reset instance of inflate.

After the full flush, you must call deflate() until avail_out is not zero, in order to emit the end of the previous stream. What follows that is what you would put in your packet labeled as following a full flush. It is possible that you are not properly locating the start of the compressed data following the full flush, since depending on your buffer size, the flush may happen to be completed most of the time on the first deflate() call.

On the receiver side, make sure that the inflator is starting fresh and decoding in raw mode, which would be done with an inflateInit2() with windowBits equal to -15, or inflateReset() on such a raw inflate state. You must already be doing that if it is working 99% of the time, or ever for that matter.
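As a rough illustration of the point about starting fresh in raw mode, here is a small sketch using Python's `zlib` module (a wrapper around the same library) instead of the C API. In C the equivalent setup would be the inflateInit2() call with windowBits of -15 described above. The helper name `try_raw_inflate` and the sample payloads are made up for the example. Note that the Python binding drains all pending output inside `flush()`, so the manual loop on avail_out is not visible here.

```python
import zlib


def try_raw_inflate(packet):
    """Inflate one packet with a brand-new raw inflater; None on error."""
    fresh = zlib.decompressobj(wbits=-15)  # raw deflate, no header check
    try:
        return fresh.decompress(packet)
    except zlib.error:
        return None


comp = zlib.compressobj(wbits=-15)  # raw deflate on the sending side
p1 = comp.compress(b"hello hello hello ") + comp.flush(zlib.Z_SYNC_FLUSH)
p2 = comp.compress(b"hello world ") + comp.flush(zlib.Z_FULL_FLUSH)
p3 = comp.compress(b"hello again hello") + comp.flush(zlib.Z_SYNC_FLUSH)

# p3 starts right after the full flush, so it cannot refer back to
# earlier data and a fresh inflater can decode it on its own.
after_flush = try_raw_inflate(p3)

# p2 ENDS with the full flush but was compressed with p1 still in the
# window, so it may contain back-references that a fresh inflater
# cannot resolve. It is not guaranteed to decode in isolation.
flush_packet = try_raw_inflate(p2)
```

Here `after_flush` holds the original third payload. Whether `flush_packet` decodes depends on whether the compressor happened to emit a match reaching back into `p1`.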

Upvotes: 1

David Schwartz

Reputation: 182753

Your logic is slightly wrong. A FULL_FLUSH packet does a full flush. And after that packet, the state is flushed. By processing the FULL_FLUSH packet, you are attempting to perform the flush -- but you can't, because you don't have the right state to perform the flush.

You can, however, resume after the flush. Because after a flush, the state is flushed.

So after a loss, you don't want to process the FULL_FLUSH packet because you don't have the context necessary to process it. However, after that packet, the state has been fully flushed, so you can resume inflation with the next packet.

So your packet loss logic should be:

  1. Wait until you receive a packet with the FULL_FLUSH bit set.
  2. Wait for the next packet.
  3. If there have been no packets lost between the FULL_FLUSH packet and this packet, then resume inflation (with a clean context!) starting with this packet.
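The three steps above could be sketched as a small receiver state machine. This is only an illustration in Python (whose `zlib` module wraps the same library), under several assumptions that are not in the original posts: the 2-byte header uses the top bit as the full-flush flag and the low 15 bits as the sequence number, the stream is raw deflate (`wbits=-15`), and the `Receiver` class and its field names are hypothetical. Reordered or duplicated packets are simply treated as a loss.

```python
import struct
import zlib

# Assumed header layout (not specified in the question):
FULL_FLUSH_BIT = 0x8000  # top bit: packet ends with a full flush
SEQ_MASK = 0x7FFF        # low 15 bits: sequence number


class Receiver:
    """Hypothetical receiver that resumes only after a full flush."""

    def __init__(self):
        self.inflater = zlib.decompressobj(wbits=-15)
        self.expected = 0             # next sequence number expected
        self.synced = True            # False while recovering from a loss
        self.prev_full_flush = False  # did the previous packet end a flush?

    def on_packet(self, packet):
        """Return the decompressed payload, or None if the packet is dropped."""
        (header,) = struct.unpack("!H", packet[:2])
        seq = header & SEQ_MASK
        full = bool(header & FULL_FLUSH_BIT)

        in_order = seq == self.expected
        if not in_order:
            # Something before this packet was lost: the inflate state
            # no longer matches the sender's.
            self.synced = False
        elif not self.synced and self.prev_full_flush:
            # This packet directly follows a full-flush packet, so it
            # can be decoded with a clean context.
            self.inflater = zlib.decompressobj(wbits=-15)
            self.synced = True

        # Remember this packet's flag and sequence for the next call.
        self.prev_full_flush = full
        self.expected = (seq + 1) & SEQ_MASK

        if not self.synced:
            return None
        return self.inflater.decompress(packet[2:])
```

If the packet after a full-flush packet is itself lost, the gap check clears `synced` again and the receiver keeps waiting for the next full flush.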

Upvotes: 1
