gxor

Reputation: 353

Huffman in JPEG decoding: what are symbols, codes and length

I'm following Daniel Harding's YouTube playlist "Everything you need to know about JPEG" to understand how the JPEG file format is defined.

In the decoding process, the code looks something like this:

length = getNextSymbol()
...
coefficient = readBits(length)
mcu[0] = coefficient

and the getNextSymbol function does something like this:

currentCode = 0
for length in 1..16:
    currentCode = (currentCode << 1) | readNextBit()
    for each huffman_code with that length:
        if currentCode == huffman_code:
            return the symbol assigned to huffman_code

The complete code is hosted on GitHub: https://github.com/dannye/jed/blob/master/src/decoder.cpp

So we first get the symbol, which I assumed would be the value we want to parse. But then we treat that symbol as a length and read that many bits. Does the Huffman table only store how many bits to read, and not the "real" value? But in AC decoding we store the symbol directly into our mcu values!

What I understood:

Question: Why do we read the symbol, convert it into a length, read that many bits, and store the value we read, while in AC decoding we store the symbol without reading any bits?

What am I missing here? Thanks for helping, it's really tough for me to understand!

Upvotes: 0

Views: 1122

Answers (1)

Mark Adler

Reputation: 112374

A DC component is represented by a Huffman coded bit count, followed by that number of bits interpreted as a signed integer. That integer is added to the DC coefficient of the last block to get the DC coefficient of this block. (For the first block, the "previous" DC coefficient is taken to be zero.)
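As a minimal sketch of the DC step described above, the Python below turns a decoded bit count and the raw bits that follow it into a coefficient. The names `extend` and `decode_dc` are hypothetical and are not taken from the linked decoder. One detail worth spelling out: the "signed integer" is not two's complement. As far as I understand the EXTEND procedure in the JPEG specification (ITU-T T.81), a leading 1 bit means the value is positive and used as-is, while a leading 0 bit means it is negative and 2^size - 1 is subtracted.

```python
def extend(bits: int, size: int) -> int:
    """Map `size` raw bits to a signed value using JPEG's sign convention."""
    if size == 0:
        return 0  # a bit count of zero means the value is zero
    if bits >= 1 << (size - 1):
        return bits  # leading bit is 1: positive, used as-is
    return bits - ((1 << size) - 1)  # leading bit is 0: negative


def decode_dc(size: int, bits: int, previous_dc: int) -> int:
    """Return this block's DC coefficient.

    `size` is the Huffman-decoded symbol (the bit count), `bits` holds the
    `size` raw bits read after it, and `previous_dc` is the DC coefficient
    of the previous block (zero for the first block).
    """
    return previous_dc + extend(bits, size)


# First block: bit count 3, raw bits 010, so the difference is 2 - 7 = -5.
first_dc = decode_dc(3, 0b010, 0)
# Second block: bit count 2, raw bits 11, so the difference is +3.
second_dc = decode_dc(2, 0b11, first_dc)
```

So the Huffman table only ever yields the bit count; the actual difference comes from the raw bits read afterwards.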

An AC component is represented by a Huffman coded run length/bit count pair, followed by that number of bits interpreted as a signed integer. The run length is in the high four bits of the decoded symbol, and the bit count is in the low four bits. Each such component results in a sequence of zero coefficients whose length is the run length, followed by a coefficient whose signed value is the bits that followed the code.
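The AC step can be sketched the same way. To keep the example self-contained, `decode_ac` (a hypothetical name) takes a list of (symbol, raw bits) pairs instead of a real bit reader, which is an assumption made only for illustration. It also handles two special symbols from baseline JPEG that the paragraph above does not cover: 0x00 ends the block, and 0xF0 stands for sixteen zeros with no coefficient after them. The result is in zigzag order; un-zigzagging is left out.

```python
def extend(bits: int, size: int) -> int:
    """JPEG sign convention: a leading 0 bit marks a negative value."""
    if size == 0:
        return 0
    return bits if bits >= 1 << (size - 1) else bits - ((1 << size) - 1)


def decode_ac(pairs):
    """Expand (symbol, raw_bits) pairs into the 63 AC coefficients.

    `pairs` is a hypothetical stand-in for a decoder's bit reader: each
    entry is a Huffman-decoded symbol and the raw bits that followed it.
    """
    coeffs = [0] * 63
    pos = 0
    for symbol, bits in pairs:
        if symbol == 0x00:  # end of block: everything left stays zero
            break
        if symbol == 0xF0:  # sixteen zeros, no coefficient follows
            pos += 16
            continue
        run = symbol >> 4     # high four bits: zeros to skip first
        size = symbol & 0x0F  # low four bits: how many raw bits follow
        pos += run
        if pos >= 63:
            raise ValueError("run length goes past the end of the block")
        coeffs[pos] = extend(bits, size)
        pos += 1
    return coeffs


# Symbol 0x02: no zeros, 2 bits (11 -> 3). Symbol 0x21: two zeros, then
# 1 bit (0 -> -1). Symbol 0x00: end of block.
block = decode_ac([(0x02, 0b11), (0x21, 0b0), (0x00, 0)])
```

Note that the symbol itself is never stored as a coefficient; it is only split into a run length and a bit count.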

In both cases some number of bits are fetched after the Huffman code to get a coefficient value. So I don't know what you mean by "But in AC decoding we store the symbol directly into our mcu values!"

Upvotes: 3
