Reputation: 3369
The full question from our sample exam paper:
Explain by highlighting the relevant parts of the algorithm, why the GIF format is not the most compact format for representing images with natural content.
I understand that using GIFs wouldn't be great for natural images because it's limited to 256 colours, but why wouldn't it provide a sufficiently compact file? If anything, you'd think that less colours would imply smaller file-sizes.
In our notes we're told that the LZW compression used is better than Huffman for a few reasons (incl. the fact it only does one pass). Would Huffman encoding/compression result in a smaller file?
According to Wikipedia, the PNG format provides better compression than GIF. LZW is most likely the culprit then, but why? Which "parts of the algorithm" support the argument that it "is not the most compact format for representing images", particularly for natural images?
Upvotes: 3
Views: 548
Reputation: 12486
I'm not precisely sure on the details of LZW coding, but I believe it compresses data by building a dictionary of common bit sequences (which must be identical each time they appear). This means that line-drawings and so forth compress very well, because they contain many areas of "solid" colour (i.e. the same pixel colours repeated identically, many times in a row.) If you have an area of 100 white pixels, you can compress that by saying it's '100 of the same white pixel in a row'.
"Natural images", such as those produced by digital cameras, don't contain areas of solid colour. A photo of a blue wall will actually contain many different shades of blue - camera noise, at the very least, will make every pixel slightly different from its neighbours. LZW will not be able to find many repeated sequences, so it won't be able to compress the data much.
JPEG achieves smaller file sizes than GIF because it's not afraid to lose a bit of information from the picture - it's lossy. The trick is that JPG is designed to only lose information that humans are bad at seeing anyway. (See note 1.) We're good at seeing artifacts in "smooth" areas of pictures, but not good at seeing artifacts near sharp transitions in the image.
It's also an entirely different breed of compression algorithm, based on encoding the image as a 2-D frequency domain representation (digital signals processing type stuff) rather than trying to find repeating subsequences of bits.
Note 1: JPEG is a "perceptual" compression method. It plays to human visual strengths and weaknesses - small errors in "smooth" regions of colour are very easy to see, but small errors in "busy" areas of the image are quite easy to miss amongst the details.
MP3 audio works on the same principle. It throws away information that we can't hear; for example, if there's a loud sound followed by a quiet sound, we generally can't hear the quiet sound - our ears have been overwhelmed by the loud sound, and won't pick up the quiet sound that comes after it. The MP3 encoding throws away the quiet sound and concentrates on getting the loud sounds right.
Note 2:
I just realised I haven't actually addressed PNG.
PNG achieves better compression than GIF because it applies a pre-filtering step before the lossless compression (i.e. DEFLATE
, roughly equivalent to LZW.) See Wikipedia's explanation of PNG filtering.
The idea is that neighbouring pixels may be slightly different from each other (due to noise), but they will be roughly the same value.
Say we had the data:
132 133 134 135 136 137 138
LZW looks at this and says "these values are all different; I can't compress that."
PNG looks at this in terms of the differences between values:
132 +1 +1 +1 +1 +1 +1 +1
That is, the first data element is 132
, and the following elements are obtained by adding 1
. This representation compresses quite well with LZW.
Upvotes: 8