FlyingTeller
FlyingTeller

Reputation: 20517

Understanding shift operations on too small types

I had an issue in my code with bitwise shifting in C that I boiled down to the following example:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>

int main(int argc, char **argv)
{
  char a = 1;
  int i=0;
  uint64_t b;
  for(i=0; i<64; i++)
  {
    b=0;
    printf("i=%d\n", i);
    b = (a<< ((uint64_t) i));
    printf("%" PRIu64 "\n", b);
  }
  return 0;

}

For reasons not apparent in this MWE a is a char from which I wanted to generate the powers of 2 up to 2^63. It failed because something weird was going on because a as not uint64_t. The obvious fix was

b = ((uint64_t)a<< ((uint64_t) i))

To understand what exactly was going on I wrote above MWE example from which I get the output (shown partially):

i=30  
1073741824  
i=31  
18446744071562067968  
i=32  
1  
i=33  
2  

Now I am wondering how can I explain the jump from i=30 to i=31? And how does it end up being 1 again at i=32?

If it is of interest, I compiled above code using gcc (gcc (SUSE Linux) 4.8.5)

Upvotes: 1

Views: 129

Answers (4)

dbush
dbush

Reputation: 223992

You're performing a shift that's greater than the size of the type in question.

First, a (which is of type char) is promoted to type int. This is detailed in section 6.3.1.1 of the C standard:

The following may be used in an expression wherever an int or unsigned int may be used:

  • An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
  • A bit-field of type _Bool, int, signed int, or unsigned int.

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

Assuming an int is 32 bits on your system, shifting left by more than 30 bits invoked undefined behavior. This is detailed in section 6.5.7 of the standard:

3 The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined

4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

Upvotes: 3

4386427
4386427

Reputation: 44274

A lot of answers already address that your code have undefined behavior. Therefore it doesn't in general make sense to reason about what happened (as anything could have happened...).

But.... sometimes it's actually quite interesting to do anyway - keeping in mind that it is pure guessing and not guaranteed to be correct and highly system dependent and ....

So with all the disclaimers in place...

Where could the odd number 18446744071562067968 come from?

Let's assume 32 bit intand also notice that your char a = 1; could just as well be int a = 1; because of integer promotion. So we can write:

int a = 1;
int as = a << 31;  // Undefined behavior here as 1*2^31 can't be stored in 32 bit int
uint64_t b = as;
printf("%" PRIu64 "\n", b);

Output:

18446744071562067968

Hmmm.... there we have that mysterious 18446744071562067968 but why?

Let's do one more printing:

int a = 1;
int as = a << 31;  // Undefined behavior here
uint64_t b = as;
printf("%d\n", as);
printf("%" PRIu64 "\n", b);

Output

-2147483648
18446744071562067968

So as is negative so we really did:

uint64_t b = -2147483648;

Since b is unsigned the above is calculated as:

uint64_t b = UINT64_MAX + 1 - 2147483648;  // which is 18446744071562067968

So now we know where 18446744071562067968 came from. Not that mysterious after all.

But this leaves another question - why is as negative?

Hmmm.... let's do one more printing:

int a = 1;
int as = a << 31;  // Undefined behavior here
uint64_t b = as;
printf("%d\n", as);
printf("%x\n", as);  // Print as in hex
printf("%" PRIu64 "\n", b);

Output:

-2147483648
80000000
18446744071562067968

So in hex as is 80000000 which is in fact a 1 left shifted 31 times. So the processor simply did what we asked for, i.e. 1 << 31 It is undefined by the C standard but your/my processor was simply implemented to do like that - other machines may do something else.

80000000 as a 32 bit 2's complement is -2147483648 and thereby we know why as is negative.

Upvotes: 1

llllllllll
llllllllll

Reputation: 16404

Because this is an undefined behavior.

C99

shift-expression:

The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

Here your char a is promoted to int[See here why]. And in your machine, the width of int is not bigger than 63, thus according to the standard, the behavior is undefined.

Upvotes: 1

Bathsheba
Bathsheba

Reputation: 234715

The (somewhat unexpected) rule that kicks in here is that, when it comes to <<,

the type of the result is that of the promoted left operand

This means that the type of a << (uint64_t) i is an int, as the char type for a is automatically widened to an int.

It appears that your int is a 32 bit 2's complement signed type. Therefore for i greater or equal to 31, the behaviour of the expression is undefined.

Upvotes: 3

Related Questions