lbyeoksan
lbyeoksan

Reputation: 123

Strange behavior of bit operation in C

I'm studying C programming language and its bit operators. I've written codes like below and I expected that the result of the codes are same. But the reality is not.

#include <stdio.h>
#define N 0

int main() {
    int n = 0;
    printf("%d\n", ~0x00 + (0x01 << (0x20 + (~n + 1))));
    printf("%d\n", ~0x00 + (0x01 << (0x20 + (~N + 1))));
    return 0;
}

I assumed that the machine represent numbers as 2's complement on 32-bit. They both have to be -1 which is all bits are 1 but first one is 0 and second one is -1. I think both are exactly same code except whether using variable or constant.

I used gcc with option -m32 on Mac of i5 CPU.

What's wrong with it?

Thanks.

Upvotes: 1

Views: 217

Answers (2)

Jens
Jens

Reputation: 9130

The short answer

You're evaluating the same expression in two different ways—once at runtime on an x86, and once at compile time. (And I assume you've disabled optimizations when you compile, see below.)

The long answer

Looking at the disassembled executable I notice the following: the argument to the first printf() is computed at runtime:

movl   $0x0,-0x10(%ebp)
mov    -0x10(%ebp),%ecx  ; ecx = 0 (int n)
mov    $0x20,%edx        ; edx = 32
sub    %ecx,%edx         ; edx = 32-0 = 32
mov    %edx,%ecx         ; ecx = 32
mov    $0x1,%edx         ; edx = 1
shl    %cl,%edx          ; edx = 1 << (32 & 31) = 1 << 0 = 1
add    $0xffffffff,%edx  ; edx = -1 + 1 = 0

The shift is performed by an x86 SHL instruction with %cl as its operator. As per Intel manual: "The destination operand can be a register or a memory location. The count operand can be an immediate value or register CL. The count is masked to five bits, which limits the count range to 0 to 31. A special opcode encoding is provided for a count of 1."

For the above code, that means that you're shifting by 0, thus leaving the 1 in place after the shift instruction.

In contrast, the argument to the second printf() is essentially a constant expression that is computed by the compiler, and the compiler does not mask the shift amount. It therefore performs a "correct" shift of a 32b value: 1<<32 = 0 It then adds -1 to that—and you see the 0+(-1) = -1 as a result.

This also explains why you see only one warning: left shift count >= width of type and not two, as the warning stems from the compiler evaluating the shift of a 32b value by 32 bits. The compiler did not issue any warning regarding the runtime shift.

Reduced test case

The following is a reduction of your example to its essentials:

  #define N 0
  int n = 0;

  printf("%d %d\n", 1<<(32-N) /* compiler */, 1<<(32-n) /* runtime */);

which prints 0 1 demonstrating the different results of the shift.

A word of caution

Note that the above example works only in -O0 compiled code, where you don't have the compiler optimize (evaluate and fold) constant expressions at compile time. If you take the reduced test case and compile it with -O3 then you get the same and correct results 0 0 from this optimized code:

movl   $0x0,0x8(%esp)
movl   $0x0,0x4(%esp)

I would think that if you change the compiler options for your test, you will see the same changed behavior.

Note There seems to be a code-gen bug in gcc-4.2.1 (and others?) where the runtime result is just off 0 8027 due to a broken optimization.

Upvotes: 5

chux
chux

Reputation: 153338

A simplified example

unsigned n32 = 32;
printf("%d\n", (int) sizeof(int));  // 4
printf("%d\n",  (0x01 << n32));     // 1
printf("%d\n",  (0x01 << 32));      // 0

You get UB in (0x01 << n32) as the shift >= width of int. (Looks like only 5 lsbits of n32 participated in the shift. Hence a shift of 0.)

You get a UB in (0x01 << 32) as the shift >= width of int. (Looks like complier performed the math with more bits.) This UB could have been the same as above.

Upvotes: 5

Related Questions