Reputation: 11
I'm using a simple and correct LZW compression program, and I've found that for some .gif and .bmp files the program outputs a larger file for the original. Could anyone explain what factors would cause this outcome?
I assume that the original file is too random, but I'm not exactly sure how to show this.
Upvotes: 1
Views: 466
Reputation: 112374
That is not a failure. It would be a failure if the program were not lossless.
It is mathematically necessary that if some sequences are compressed losslessly, then there must exist some sequences that are expanded losslessly.
As to why your particular sequence is expanded as opposed to compressed, it is simply that that sequence does not have sufficient redundancy for LZW to take advantage of. LZW is an obsolete and relatively ineffective algorithm for compression, as compared to more modern algorithms. (Where "more modern" might mean 30 years old instead of 40 years old.)
Also LZW is one of the worst at how much incompressible data is expanded. You might consider prepending your result with one bit that indicates whether the subsequent data is compressed or not. Then you won't get the hit of LZW's expansion ratio.
You can use other compression algorithms, e.g. those implemented in gzip and xz, to get a feel for how inherently compressible your data is.
Upvotes: 1