Reputation: 2916
Setting aside headers, is it in general possible to decompress (e.g. unzip) arbitrary data, for example data read from /dev/random?
In other words, does compressed data exhibit some clear patterns that distinguish it from random data?
Note regarding headers: if you try to unzip random data like this:
head -c 100 /dev/random > file
unzip file
you get an error:
Archive: file
End-of-central-directory signature not found. Either this file is not a zipfile, or it constitutes one disk of a multi-part archive. In the latter case the central directory and zipfile comment will be found on the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of file or file.zip, and cannot find file.ZIP, period.
Upvotes: 1
Views: 2892
Reputation: 112189
Yes, there are many specific format aspects that must be correct, lest the decompressor abort.
First off, the header formats for zip, gzip, zlib, etc. all have magic numbers in the first few bytes for precisely this purpose: to avoid wasting time on a file that wasn't made by zip, gzip, zlib, etc. in the first place.
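As a rough illustration of that kind of header check, here is a minimal Python sketch. The function name looks_like is hypothetical, and the check is deliberately simplified: it only inspects the leading bytes (the gzip ID bytes, the zip local file header signature, and the two-byte zlib header whose value must be a multiple of 31), so it ignores cases such as empty or self-extracting zip archives.

```python
# Leading signatures for two common container formats.
GZIP_MAGIC = b"\x1f\x8b"         # gzip ID1, ID2 bytes
ZIP_LOCAL_MAGIC = b"PK\x03\x04"  # zip local file header signature


def looks_like(data: bytes) -> str:
    """Guess the container format from the first few bytes only (hypothetical helper)."""
    if data.startswith(GZIP_MAGIC):
        return "gzip"
    if data.startswith(ZIP_LOCAL_MAGIC):
        return "zip"
    if len(data) >= 2:
        cmf, flg = data[0], data[1]
        method_is_deflate = (cmf & 0x0F) == 8   # CM field must be 8
        window_ok = (cmf >> 4) <= 7             # CINFO field must be at most 7
        check_ok = (cmf * 256 + flg) % 31 == 0  # FCHECK makes the pair a multiple of 31
        if method_is_deflate and window_ok and check_ok:
            return "zlib"
    return "unknown"
```

Passing a match here says nothing about whether the rest of the stream is valid; it only filters out inputs that cannot be the named format.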
Even if you get all the header data right, the compressed data itself will almost always have format constraints that must be met. My testing shows that random data presented as deflate-compressed data is rejected, on average, within about 100 bytes.
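If you want to try this kind of experiment yourself, here is a minimal Python sketch using the standard zlib module in raw deflate mode (wbits=-15, so no header is expected). It is not the test harness behind the figure above, and the helper names bytes_until_error and summarize are made up for illustration. It feeds random bytes one at a time and records how many were consumed before the decompressor raised an error. Note that a random stream can occasionally end cleanly or simply run out of input without an error, and those cases are reported as None.

```python
import os
import zlib


def bytes_until_error(data: bytes):
    """Feed data to a raw deflate decompressor one byte at a time.

    Returns the number of bytes consumed when zlib reports an error,
    or None if the stream ended cleanly or the input ran out first.
    """
    d = zlib.decompressobj(wbits=-15)  # negative wbits selects raw deflate
    for i in range(len(data)):
        try:
            d.decompress(data[i:i + 1])
        except zlib.error:
            return i + 1
        if d.eof:
            return None  # a final block terminated without any error
    return None


def summarize(trials: int = 1000, size: int = 4096):
    """Return (number of trials rejected, mean bytes consumed before rejection)."""
    results = (bytes_until_error(os.urandom(size)) for _ in range(trials))
    caught = [n for n in results if n is not None]
    mean = sum(caught) / len(caught) if caught else None
    return len(caught), mean
```

Calling summarize() gives a count and an average for one run. The exact numbers vary from run to run and depend on how the clean-ending cases are treated, so treat them as a sanity check rather than a reproduction of the measurement above.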
Upvotes: 1