Reputation: 29
I know that signed int overflow is undefined in C.
I understood this to mean that the resulting number is not predictable, but that it is still a number. However, this code does not behave as if the result of the overflow were an int value:
#include <stdio.h>
int main()
{
    int i, m = -727379968, w = 1, n;
    for (i = 1; i <= 12; i++)
        w *= 10;
    n = w;
    printf("sizeof(int)=%d\n", (int)sizeof(int));
    printf("m=%d n=%d m-n=%d (1<m)=%d (1<n)=%d (m==n)=%d\n", m, n, m - n, 1 < m, 1 < n, m == n);
    return 0;
}
Using gcc version 11.4.0 with no optimization:
sizeof(int)=4
m=-727379968 n=-727379968 m-n=0 (1<m)=0 (1<n)=0 (m==n)=1
This is OK.
With gcc -O2:
sizeof(int)=4
m=-727379968 n=-727379968 m-n=0 (1<m)=0 (1<n)=1 (m==n)=0
This is wrong.
Why is 1 less than a negative number?
Why are m and n not equal when the difference is 0?
I expected that after the assignment n=w, there is some int value in n. But I do not understand the results.
Edit: Using gcc 14.2.0 (both with and without -O2) gives the correct result:
sizeof(int)=4
m=-727379968 n=-727379968 m-n=0 (1<m)=0 (1<n)=0 (m==n)=1
So it seems there was a bug in gcc 11.4.0.
O.K., we call it a bug if we expect merely an undefined value; if we accept undefined behavior, we would not call it a bug.
Upvotes: 1
Views: 186
Reputation: 2447
In C, the moment a signed int overflows (exceeds its maximum or minimum representable value), the behavior is undefined. This is critically important. It doesn't simply mean "the result is unpredictable"; it also means the compiler is allowed to assume that no overflow ever occurs.
When you compile with optimizations such as -O2, the compiler takes advantage of this. The w *= 10 loop in the code does overflow w. The compiler, operating under the premise that overflow does not take place, draws conclusions about the value of w (and therefore about n, which is assigned from w).
The compiler assumes both w and n will be positive, which is why it folds the comparisons 1 < n and 1 < m to constants at compile time. It may even fold m == n to a constant. The observed effects arise not because the overflow generates some particular value, but because the compiler optimized on the basis of an assumption that the program violates at run time.
The critical point here is: undefined behavior gives the compiler license to make assumptions, and when those assumptions are violated at run time, the consequences can be baffling.
Upvotes: 0
Reputation: 181149
I know that signed int overflow is undefined in C.
Yes, in the sense that if evaluation of an expression of type signed int is specified to produce a value that is not in range for that type, then the behavior is undefined.
I understood this as the resulting number is not predictable but it is still a number.
"Undefined behavior" means that the language places no requirements on the program's behavior. Many parties, including implementors of major compilers, have interpreted that very expansively, contrary to your expressed understanding. C23 tries to rein that in at least a little, but probably not enough to support your understanding. What you describe would be expressed in standardese as something more along the lines of "the value is unspecified".
Expansive interpretations of undefined behavior allow compilers to perform optimizations that are safe if and only if the program's behavior is well defined (so, among other things, only if there is no signed integer overflow in any evaluated expression). If in fact there is UB then any kind of seemingly inconsistent behavior is consistent with the language spec, because there are no constraints on the program's behavior in such cases.
For example, the compiler can observe that:

- m's initial value is negative and does not change
- n's value is computed, in w, as a product of positive numbers, at least one of which is greater than 1

If it supposes that it can produce whatever behavior is convenient to it in the event that there is integer overflow, then it can conclude at compile time that 1<m will evaluate to 0, 1<n will evaluate to 1, and m==n will evaluate to 0, and therefore produce output that so indicates. As far as C is concerned, that is not in conflict with also producing output that shows the difference between m and n as zero, or that shows the same, negative, value for both m and n. Because UB.
Upvotes: 4
Reputation: 782099
The optimizer has removed the comparisons 1 < n and m == n and replaced them with known values.

The optimizer assumes that undefined behavior will never occur, which means that it ignores the possibility of overflow. So it "knows" that multiplying 1 by 10 twelve times will result in a large positive value (10^12 in this case). Therefore, 1 < n will always be true. And since m is a negative number, m == n can never be true.
Upvotes: 6