Reputation: 6158

Why does GCC treat two similar loops differently?

Note:

If I understand the following code right, it will skip the whole loop, because when compare unsigned (j) and signed (-1), it seems that the -1 will be convert to UINT_MAX. (like this question explained)

The first loop:

unsigned int j = 10;

for (; j > -1; --j) {     --->  `>`
    printf("%u", j);
}

Part of assembly code of first loop:

movq    %rsp, %rbp
.cfi_def_cfa_register 6
movl    %edi, -20(%rbp)
movq    %rsi, -32(%rbp)
movl    $10, -4(%rbp)
nop                           --->**elision**
popq    %rbp
.cfi_def_cfa 7, 8
ret

The second loop of second loop:

unsigned int j = 10;

for (; j >= -1; --j) {  --->  `>=`
    printf("%u", j);
}

Part of assembly code:

movq    %rsp, %rbp
.cfi_def_cfa_register 6
subq    $32, %rsp
movl    %edi, -20(%rbp)
movq    %rsi, -32(%rbp)
movl    $10, -4(%rbp)
jmp .L2                        --->** still a loop here **

.L3:

movl    -4(%rbp), %eax
movl    %eax, %esi
movl    $.LC0, %edi
movl    $0, %eax
call    printf
subl    $1, -4(%rbp)

.L2:

cmpl    $-1, -4(%rbp)
je  .L3
leave
.cfi_def_cfa 7, 8
ret

So my question is

Why the gcc (I use GCC: (Ubuntu 4.8.2-19ubuntu1) 4.8.2) treat a similar situation, it optimize the first one but didn't the second? (* Or my understand of the code is wrong?) (it has something to do with the assembly?)

Edit: You can go to this site to check. If you just use -S compiler option(or no compiler option) you will get the same result as I do. (Thanks @Raymond Chen for reminder)

step 1:

Open above site and copy the following code to Code Eidtor.

 #include <stdio.h>
 int main (int argc, char *argv[]) {

   unsigned int j = 10;

   for (; j > -1; --j) {    
      printf("%u", j);
   }
 }

step 2:

Choose g++ 4.8 as compiler. Compiler option is empty.(or -S)

step 3:

You get the first situation. Now, change the j > -1 to j >= -1 and you can see the second.

Upvotes: 2

Answers (3)

AnT stands with Russia

Reputation: 320671

It looks like the compiler did not attempt to take into account the "known" value of j from the previous initialization. It simply interpreted the cycles independently, under assumption that the initial value of j can be anything.

Under such circumstances these two cycles are not even remotely similar.

The first cycle is an "impossible" cycle. It contains iteration condition j > -1, which will be interpreted as j > UINT_MAX. Obviously, this is an impossible condition. So, the compiler decided to completely eliminate the cycle.

The second cycle's condition is not impossible. It is equivalent to j >= UINT_MAX and it is perfectly satisfiable for j == UINT_MAX. So, the second cycle is straightforwardly translated into a perfectly normal code that calls your printf.

Obviously, the cycle in the second version will never make more than one iteration, meaning that there's no real need for an actual cycle. By this is not something the compiler was able (or even tried) to figure out by itself. You asked for a cycle - it gave you a cycle.

Upvotes: 1

david.pfx

Reputation: 10853

The applicable conversion is described in the C standard n1570 S6.3.1.3 as follows:

...if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

So -1 is converted to UINT_MAX which for 32-bit arithmetic is 0xffffffff. This is the same bit pattern, so in assembly language terms it's a no-op.

In the first case the compiler can establish that the loop exit condition is trivially true for all values of the loop variable. No further analysis is needed and at a suitable level of optimisation the loop should be elided.

In the second case there is no such trivial analysis available. However, if the compiler performs a data flow analysis it will discover that the loop exit condition is satisfied before the loop is entered. At a suitable (but perhaps different) level of optimisation this loop can also be elided.

The analysis required is different in each case and harder in the second case. However, I would not care to predict which compilers would perform loop elision in which cases. You would have to test them to find out (as you did).

A note on terminology: the term elision is a more precise description when a compiler decides to omit code entirely. The term optimisation is better used when the compiler makes choices between different possible code generation strategies, perhaps choosing between speed and space.

Upvotes: 4

davmac

Reputation: 20641

Probably because 'j > -1' cannot be true for any value of j, whereas 'j >= -1' can be true if j == UINT_MAX. So there is a subtle difference which affects optimisation. In the first case the condition and therefore the loop can be trivially optimised away; in the second case, a slightly more complex analysis is required.