MasterID
MasterID

Reputation: 1510

Compressing increasing sequence of unique integers

I have a sequence of two-byte integers ranging from 0 to 2^16-1 = 65535. They are sorted so the sequence is always increasing and there are no repetitions.

This sequence usually have around 270~500 numbers. If it were denser(ie >=32768 elements), I could save the numbers that are not in the sequence, but thats not the case... Now here is the challenge, I have to compress then using less than 6 bits per integer(on average)!

The best guess I had was to use Bloom Filters. In this manner, to decompress the sequence, I would have to iterate over all integers from 0 to 65535 asking if they are in set. But I don't know how to handle false positives. I could store them, but I'm afraid it would take too much data.

Upvotes: 0

Views: 267

Answers (1)

Mark Adler
Mark Adler

Reputation: 112642

It is not possible to use less than 6 bits per integer on average. You can count the number of ways to pick 270 or 500 things from a set of 65536 without repetitions, and determine that the number of bits required to represent a pick is 9.34 bits per thing and 8.46 bits per thing respectively.

If you were picking 2721 things or more out of 65536, then you could represent them in 6 bits or less.

Upvotes: 1

Related Questions