Reputation: 7994
I have the following problem I am unable to solve gracefully.
I have a data type that can take 3 possible values (0,1,2). I have an array of 20 element of this data type.
As I want to encode the information on the least amount of memory, I did the following :
char
holds 8 bits, so I can put 4 times an elementchar
holds 40 bits, so I can store 20 elements.I have done this and it works time.
However I'm interested evaluating the space gained by using the fact that my element can only take 3 values and not 4.
Every possible combination gives us 3 to the 20th power, which is 3,486,784,401. However 256 to the 4th power gives us 4,294,967,296 , which is greater. This means I could encode my data on 4 char
.
Is there an generic method to do the 2nd idea here ? The 1st idea is simple to implement with bit mask / bit shifts. However since 3 values doesn't fit in an integer number of bits, I have no idea how to encode / decode any of these values into an array of 4 char.
Do you have any idea or reference on how it's done ? I think there must be a general method. If anything I'm interested about the feasability of this
edit : this could be simplified to : how to store 5 values from 0 to 2 into 1 byte only (as 256 >= 3^5 = 243)
Upvotes: 2
Views: 383
Reputation: 5751
There is a generic way to figure out how much bits you need:
If your data type has N different values, then you need log(N) / log(2) bits to store this value. For instance in your example, log(3) / log(2) equals 1.585 bits.
Of course in reality you will to pack a fixed amount of values in an integer number of bits, so you have to multiply this 1.585 with that amount and round up. For instance if you pack 5 of them:
1.585 × 5 = 7.925, meaning that 5 of your values just fit in one 8-bit char
.
The way to unpack the values has been shown in JS1's answer. The generic formula for unpacking is element[i] = (value / (N ^ i) ) mod N
Final note, this is only meaningful if you really need to optimize memory usage. For comparison, here are some popular ways people pack these value types. Most of the time the extra space taken up is not a problem.
bool
: uses 8 bits to store one bool
. And a lot of people really dislike the behavior of std::vector<bool>
.enum Bla { BLA_A, BLA_B, BLA_C};
an array or vector of Bla
probably uses 32 bits per element (sizeof(Bla) == sizeof(int)
).Upvotes: 0
Reputation: 4767
You should be able to do what you said using 4 bytes. Assume that you store the 20 values into a single int32_t
called value
, here is how you would extract any particular element:
element[0] = value % 3;
element[1] = (value / 3) % 3;
element[2] = (value / 9) % 3;
...
element[19] = (value / 1162261467) % 3; // 1162261467 = 3 ^ 19
Or as a loop:
for (i=0;i<20;i++) {
element[i] = value % 3;
value /= 3;
}
To build value
from element
, you would just do the reverse, something like this:
value = 0;
for (i=19;i>=0;i--)
value = value * 3 + element[i];
Upvotes: 5