Kyle Hobbs
Kyle Hobbs

Reputation: 457

Counting Minimum Number of Bytes Required to Represent a Signed Integer

My task seems simple, I need to calculate the minimum number of bytes required to represent a variable integer (for example if the integer is 5, then I would like to return 1; if the integer is 300, I would like to return 2). Not referring to the data type int which is, as pointed out in comments, always just sizeof(int), I'm referring to a mathematical integer. And I almost have a solution. Here is my code:

int i;
cin >> i;
int length = 0;
while (i != 0) {
    i >>= 8;
    length++;
}

The problem is that this doesn’t work for any negative numbers (I have not been able to determine why) or some positive numbers where the most significant bit is a 0 (because the sign bit is the bit that makes it one bit larger)... Is there any hints or advice I can get in how to account for those cases?

Upvotes: 0

Views: 2295

Answers (3)

Steve Ward
Steve Ward

Reputation: 1078

#include <algorithm>
#include <bit>

constexpr unsigned int
byte_width(const size_t x)
{
    // std::bit_width(0) returns 0, but we want it to be 1
    const unsigned int w = std::max(std::bit_width(x), 1);
    return (w / 8) + (w % 8 != 0);
}

For negative integers, return sizeof(x).

Upvotes: 0

Barmak Shemirani
Barmak Shemirani

Reputation: 31629

Stored as a single byte,
Positive numbers are in the range 0x00 to 0x7F
Negative numbers are in the range 0x80 to 0xFF

As 2-bytes,
Positive numbers are in the range 0x0000 to 0x7FFF
Negative numbers are in the range 0x8000 to 0xFFFF

As 4-bytes,
Positive numbers are in the range 0x00000000 to 0x7FFFFFFF
Negative numbers are in the range 0x80000000 to 0xFFFFFFFF

You can use a function like the following to get the minimum size:

int getmin(int64_t i)
{
    if(i == (int8_t)(i & 0xFF))
        return 1;
    if(i == (int16_t)(i & 0xFFFF))
        return 2;
    if(i == (int32_t)(i & 0xFFFFFFFF))
        return 4;
    return 8;
}

Then for example, when you see 0x80, translate it to -128. While 7F is translated to 127, and 0x801 should be translated to a positive number.

Note that this will be very difficult and rather pointless, it should be avoided. This doesn't accommodate storage of numbers in triple bytes, for that, you have to make up your own format.

Upvotes: 1

Ryan Schaefer
Ryan Schaefer

Reputation: 55

The range of signed numbers possible to store in x bytes in 2's complement is -2^(8*x-1) to 2^(8*x-1)-1. For example, 1 byte can store signed integers from -128 to 127. Your example would incorrectly calculate that only 1 byte is needed to represent 128 (if we are talking about signed numbers), as right shifting by 8 would equal zero, but that last byte is required to know that this is not a negative number.

For handling negatives, turning it into a positive number and subtracting one (because negative numbers can store an extra value) will allow you to right shift.

int i;
cin >> i;
unsigned bytes = 1;
unsigned max = 128;

if (i < 0) {
    i = ~i; //-1*i - 1
}

while(max <= i) {    
    i >>= 8;
    bytes++;
}
cout << bytes;

Another option is to use __builtin_clz() if you are using gcc. This returns the leading zeros, which you can then use to determine the minimum number of bytes.

Upvotes: 0

Related Questions