Reputation: 1510
I have a sequence of two-byte integers ranging from 0 to 2^16-1 = 65535. They are sorted so the sequence is always increasing and there are no repetitions.
This sequence usually have around 270~500 numbers. If it were denser(ie >=32768 elements), I could save the numbers that are not in the sequence, but thats not the case... Now here is the challenge, I have to compress then using less than 6 bits per integer(on average)!
The best guess I had was to use Bloom Filters. In this manner, to decompress the sequence, I would have to iterate over all integers from 0 to 65535 asking if they are in set. But I don't know how to handle false positives. I could store them, but I'm afraid it would take too much data.
Upvotes: 0
Views: 267
Reputation: 112642
It is not possible to use less than 6 bits per integer on average. You can count the number of ways to pick 270 or 500 things from a set of 65536 without repetitions, and determine that the number of bits required to represent a pick is 9.34 bits per thing and 8.46 bits per thing respectively.
If you were picking 2721 things or more out of 65536, then you could represent them in 6 bits or less.
Upvotes: 1