What is the actual structure of a dynamic huffman block?

Question

According to RFC1951:

  3.2.7. Compression with dynamic Huffman codes (BTYPE=10)

     The Huffman codes for the two alphabets appear in the block
     immediately after the header bits and before the actual
     compressed data, first the literal/length code and then the
     distance code.  Each code is defined by a sequence of code
     lengths, as discussed in Paragraph 3.2.2, above.  For even
     greater compactness, the code length sequences themselves are
     compressed using a Huffman code.

I am really sorry but mostly what i understood is not from RFC1951 but from answers here. The statement above does not give a clear picture to a beginner or any visual context of what the data looks like.

I read that they are two trees from zlib's doc algorithm.txt:

Literals or match lengths are compressed with one Huffman tree, and
match distances are compressed with another tree. The trees are stored
in a compact form at the start of each block.

As we all can see that the later points out clearly than the RFC1951, Which brings the question.

What comes after (HCLEN+4)*3 bits?

I know that you add 4 to the decimal value of the 4 bits HCLEN then extract the exact total of that number from bit stream where each extract is 3 bit number (0 to 7) and Thanks M. Adler for clearing out how bits are read from right to left.

=====================================
bitstream data
_____________________________________
|17|57|????????

Where does the first tree end?

My goal is to identify why 2 inputs sharing a prefix aren't sharing a few bytes somewhere in the compressed stream. I know its because of the Block header but since dynamic blocks construct trees based on input data then something has to be identical in theory because it's encoding not hashing, so the headers themselves should have something similar like this:

raw block data at (19, 3)
Dynamic Huffman tree: length codes: 13, distances codes: 29, code_lengths_length: 16
lengths: [4, 0, 6, 4, 5, 4, 3, 3, 5, 6, 5, 2, 0, 0, 0, 0, 3, 5, 5]

raw block data at (19, 3)
Dynamic Huffman tree: length codes: 9, distances codes: 29, code_lengths_length: 16
lengths: [4, 0, 6, 5, 3, 3, 3, 3, 5, 6, 5, 3, 0, 0, 0, 0, 3, 5, 5]

raw block data at (20, 3) 
Dynamic Huffman tree: length codes: 9, distances codes: 30, code_lengths_length: 16
lengths: [3, 0, 5, 0, 6, 3, 4, 5, 6, 6, 6, 2, 0, 0, 0, 0, 2, 5, 5]
                          
deflate-blocktype 2 dyna huff beginning at (16, 0)
raw block data at (16, 3)
Dynamic Huffman tree: length codes: 18, distances codes: 30, code_lengths_length: 16
lengths: [4, 0, 6, 6, 4, 4, 3, 3, 5, 5, 5, 2, 0, 0, 0, 0, 3, 5, 5]

raw block data at (17, 3)  
Dynamic Huffman tree: length codes: 18, distances codes: 30, code_lengths_length: 16
lengths: [4, 0, 6, 5, 4, 4, 3, 3, 5, 6, 5, 2, 0, 0, 0, 0, 3, 5, 5]

raw block data at (17, 3)
Dynamic Huffman tree: length codes: 18, distances codes: 30, code_lengths_length: 16
lengths: [4, 0, 5, 5, 4, 3, 3, 3, 4, 5, 5, 3, 0, 0, 0, 0, 3, 5, 5]


4C 9D D5 CE 35 3D 19 40 CF 49
6C 9C D7 B2 F3 4A 15 84 EF A9
4C 9C D7 B2 F3 BA 0D 85 EF 33
94 9D D7 B2 EB 4A 11 86 EF A9
94 9D D7 92 EB BA 11 45 DF 5D

What is the actual structure of a dynamic huffman block?

Answers (1)

Related Questions