DaviFN

Reputation: 1

Can true random data be losslessly compressed by this idea?

I am aware that lossless compression relies on statistical redundancy. Still, I had an idea for compressing a random binary string, and I would like to know whether (and why) it could or couldn't work:

Since the binary string is random, the probability that any bit differs from the previous bit is one half. That is, if the bit string so far is ...01101, the probability that the next bit is 0 is one half. In other words, at any position the string is expected to "switch digits" half of the time. Let's call a maximal run of N identical binary digits a "sequence" (note: a sequence of ones lies between zeros and vice versa).

That being said, at random, it is expected that:

1/2 (50%) of sequences are one digit long
1/4 (25%) are two digits long
1/8 (12.5%) are three digits long
1/16 (6.25%) are four digits long
...
1/(2^N) are N digits long
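For illustration, here is a small Python sketch (my own check, not part of the original claim) that counts run lengths in a stream of random bits and compares the observed frequencies with the 1/(2^N) prediction; the seed and NUM_BITS are arbitrary choices:

    # Empirical check of the claimed run-length distribution.
    import random
    from collections import Counter

    random.seed(0)
    NUM_BITS = 1_000_000
    bits = [random.getrandbits(1) for _ in range(NUM_BITS)]

    # Count lengths of maximal runs of identical bits.
    runs = Counter()
    run_len = 1
    for prev, cur in zip(bits, bits[1:]):
        if cur == prev:
            run_len += 1
        else:
            runs[run_len] += 1
            run_len = 1
    runs[run_len] += 1  # final run

    total_runs = sum(runs.values())
    for n in range(1, 8):
        observed = runs[n] / total_runs
        expected = 0.5 ** n
        print(f"runs of length {n}: observed {observed:.4f}, expected {expected:.4f}")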

Could this be exploited in order to compress the data? Such as:

Consider an infinite random binary string and take a sample of 2^M sequences: we know roughly half of them will be one digit long, a quarter two digits long, and so on. What would be the proper way to exploit this in order to compress the random data efficiently? And if it is not possible, why not?
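To make the accounting concrete, here is a hedged back-of-the-envelope sketch (my own assumption: a run-length scheme where a run of length N, occurring with probability 1/2^N, is encoded with an optimal prefix code, i.e. with N bits). The expected output per run then equals the expected input per run, so nothing is saved:

    # Expected cost of a run-length code on random bits (illustrative sketch).
    import math

    MAX_LEN = 60  # truncate the infinite sums; the remaining tail is negligible

    # Average number of input bits consumed per run: sum N / 2^N = 2.
    expected_run_length = sum(n * 0.5 ** n for n in range(1, MAX_LEN + 1))

    # Average number of output bits under an optimal code for P(N) = 1/2^N,
    # which assigns -log2(1/2^N) = N bits to a run of length N.
    expected_code_bits = sum(-math.log2(0.5 ** n) * 0.5 ** n for n in range(1, MAX_LEN + 1))

    print(f"expected input bits per run : {expected_run_length:.6f}")  # ~2.0
    print(f"expected output bits per run: {expected_code_bits:.6f}")   # ~2.0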

Upvotes: 0

Views: 228

Answers (1)

Mark Adler

Reputation: 112284

No. Not by any idea.

If all files are compressed by even just one bit each, then by simple counting, you are assured that at least two distinct files are compressed to the exact same thing. (Actually far more than that, but I only need two to make the point.) Now your decompressor will produce a single result from that compressed input. That single result can match at most one of the distinct files. It therefore cannot losslessly compress and decompress the one that it didn't match.
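As a concrete illustration of that counting argument (a sketch; n = 16 is chosen arbitrarily), there are more n-bit files than there are bit strings shorter than n bits, so any scheme that shortens every n-bit file must send two different files to the same output:

    # Pigeonhole counting: 2^n inputs vs. 2^n - 1 possible shorter outputs.
    n = 16
    n_bit_files = 2 ** n
    shorter_outputs = sum(2 ** k for k in range(n))  # lengths 0 .. n-1

    print(f"{n}-bit files                 : {n_bit_files}")
    print(f"outputs shorter than {n} bits: {shorter_outputs}")
    print(f"collision forced             : {n_bit_files > shorter_outputs}")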

Upvotes: 1
