banj
banj

Reputation: 71

Compiling with optimization gets a condition wrong

This piece of code has a bug in the loop termination condition. However, I still don't understand the compiler's decision - it seems to be getting into the loop one more time.

#include <stdio.h>
#include <string.h>

int main(int argc, char* argv[])
{
    #define ARR_SIZE 25
    int a[ARR_SIZE];
    memset (a,1,sizeof(a)); /*filling the array with non-zeros*/

    int i = 0;

    for (i=0; (a[i] != 0 && i < ARR_SIZE); i++)
    {
        printf ("i=%d a[i]=%d\n",i,a[i]);
    }
    return 0;
}

When compiling this with -O2 or -O3 the loop is not terminated when expected - it prints also a line for when i == ARR_SIZE.

> gcc -O3  test_loop.c
> ./a.out
i=0 a[i]=16843009
i=1 a[i]=16843009
...
i=23 a[i]=16843009
i=24 a[i]=16843009
i=25 a[i]=32766      <=== Don't understand this one.

>  gcc -O0  test_loop.c
>  a.out
i=0 a[i]=16843009
i=1 a[i]=16843009
...
i=23 a[i]=16843009
i=24 a[i]=16843009
>

The gcc version is this: gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC)

I don't see this happening on gcc 4.4.7-18.

Also other sizes of ARR_SIZE doesn't give the same results.

Upvotes: 4

Views: 143

Answers (2)

tevemadar
tevemadar

Reputation: 13205

Modern compilers notice undefined behaviour and they can use it as an excuse for generating "unexpected" code, and that is what you are experiencing. See https://godbolt.org/z/SEZKBZ, you have to go back to 4.6.x in order to have the i < ARR_SIZE comparison appear in the optimized compiled code (which is not very optimized for that old version in fact):

...
call    printf
lea     eax, [rbx+1]
mov     edx, DWORD PTR [rsp+4+rbx*4]
cmp     eax, 24
setle   cl
test    edx, edx
setne   al
add     rbx, 1
test    cl, al
jne     .L3

Higher versions contain the zero-test only:

...
call    printf
mov     edx, DWORD PTR [rsp+rbx*4]
test    edx, edx
jne     .L8

If you check the first part of the optimized compiled codes, you will see that the memset() call got inlined and unrolled, so the compiler exactly knows what is in the array and that the loop condition will index out of it (the array) prior to exiting (as there is no zero inside). And then it does not care any more about the other condition.


Similarly, if you fix the code to a[i] != 0 && i < ARR_SIZE as suggested, the compiler still knows that there are no zeroes in the array, and optimizes the zero-check away, just this time the optimization of correct code leads to correct behaviour:

call    printf
cmp     rbx, 25
je      .L6

Upvotes: 3

pmg
pmg

Reputation: 108968

When i == ARR_SIZE your condition will evaluate a[i] invoking UB

for (i=0; (a[i] != 0 && i < ARR_SIZE); i++)
//         ^^^^ Undefined Behaviour
{
    printf ("i=%d a[i]=%d\n",i,a[i]);
}

Swap the conditions: for (... (i < ARR_SIZE && a[i] != 0) ...) to take advantage of "short-circuit boolean evaluation".

Upvotes: 7

Related Questions