user3555115

Reputation: 738

Convert 4 bytes of data to an int in C

I need to convert 4 bytes of data, stored in the format below, back to the original int value. I cannot change the assignment of the int to the 4 bytes shown below.

#include <stdio.h>

int main() {
    //code

    int num = 1000;

    char a[4];
    a[0] = ( char )(num>>24)  ;
    a[1] = ( char )(num>>16) ;
    a[2] = ( char )(num>>8) ;
    a[3] = ( char )num ;


    printf("Original number is:%d\n", (a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3] ) );
    return 0;
}

I was expecting the output to be 1000, but the output is 768. How do we restore the original number from the above byte array? Is this an endianness issue?

Upvotes: 1

Views: 3054

Answers (1)

Eric Postpischil

Reputation: 224596

a[0] = ( char )(num>>24) ;

That works “okay” in this example, where num is 1000. However, when num is negative, the result of the right shift is implementation-defined (C 2018 6.5.7 5).

In the remaining assignments to a[1], a[2], and a[3], the shifted values may exceed the range of char before the cast converts them to char. If char is signed, the result of each such conversion is implementation-defined, or an implementation-defined signal is raised (6.3.1.3 3). That is a problem we will have to fix below.

First, for num = 1000, let’s suppose that −24 is stored in a[3]. This is the result we would get by taking the low eight bits of 1000 and putting them in an eight-bit two’s complement char, which is likely what your implementation uses. Then, we have a[0] = 0, a[1] = 0, a[2] = 3, and a[3] = −24.

Now let’s consider a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3].

a[0] << 24 and a[1] << 16 both yield 0. a[2] << 8 is 3 << 8, which produces 768, or 300 in hexadecimal. a[3] is −24. While a[3] is a char, it is promoted to an int when used in an expression (6.3.1.1 2). Still assuming your C implementation uses two’s complement, the binary for −24 is 11111111111111111111111111101000, or ffffffe8 in hexadecimal.

When we bitwise OR 300 and ffffffe8, the result is ffffffe8, which, in a 32-bit two’s complement int, is −24.

The easiest way to fix this is to change char a[4]; to unsigned char a[4];. That avoids any negative char values.

However, to make your code work for every value of int (assuming int is four bytes and two’s complement), we need to make some other changes:

unsigned char a[4];

/*  Convert the signed num to unsigned before shifting.
    Shifts of unsigned values are better defined than shifts
    of signed values.
*/
a[0] = (unsigned) num >> 24;
a[1] = (unsigned) num >> 16;
a[2] = (unsigned) num >>  8;
a[3] = (unsigned) num;
/*  The cast in the last assignment is not really needed since
    we are assigning to an unsigned char, and it will be converted
    as desired, but we keep it for uniformity.
*/

//  Reconstruct the value using all unsigned values.
unsigned u = (unsigned) a[0] << 24 | (unsigned) a[1] << 16 | (unsigned) a[2] << 8 | a[3];

/*  Copy the bits into an int.  (Include <string.h> to get memcpy.)
    Note:  It is easy to go from signed to unsigned because the C standard
    completely defines that conversion.  For unsigned to signed, the
    conversion is not completely defined, so we have to use some indirect
    method to get the bits into an int.
*/
int i;
memcpy(&i, &u, sizeof i);

printf("Original number:  %d.\n", i);

We need to use an unsigned value to reconstruct the bits because C’s shift operators are not well defined for signed values, especially when we want to shift a bit into the sign bit. Once we have the bits in the unsigned object, we can copy them into an int.

Upvotes: 4
