Reputation: 401
This might be a very basic programming question, but it's something I've wanted to understand for some time.
Consider this simple example:
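#include <stdio.h>

int main(void)
{
    unsigned char a = 5;
    unsigned char b = 20;
    unsigned char m = 0xFF;
    unsigned char s1 = m + a - b;   /* addition first: 255 + 5 looks like an overflow */
    unsigned char s2 = m - b + a;   /* subtraction first: stays below 255 throughout */
    printf("s1 %d s2 %d\n", s1, s2);
    return 0;
}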
Given that arithmetic operators are evaluated from left to right in C, the first calculation here should overflow at m + a. However, running this program returns the same answer for s1 and s2. My question here is: does the first expression lead to undefined behavior because of the overflow? The second expression should avoid the overflow, but I wanted to understand why the two expressions return the same answer.
Upvotes: 0
Views: 900
Reputation: 81257
Operations on types smaller than int are performed by converting the operands to int, doing the computation, and then converting the result back to the original type when it is stored. For small unsigned types, provided the result of the computation fits in type int, this will cause the upper bits of the result to be silently ignored. The published rationale for the Standard suggests the authors expected that non-archaic implementations would ignore the upper bits when storing a value into an unsigned type that isn't larger than int, without regard for whether the computation would fit in type int, but it is no longer fashionable for "modern" compilers to reliably behave in such fashion. On a system with 16-bit short and 32-bit int, for example, the function
unsigned mulMod65536(unsigned short x, unsigned short y)
{ return (x*y) & 0xFFFFu; }
will usually behave in a fashion equivalent to:
unsigned mulMod65536(unsigned short x, unsigned short y)
{ return (1u*x*y) & 0xFFFFu; }
but in some cases gcc will make "clever" optimizations based on the fact that it's allowed to behave in arbitrary fashion if x*y exceeds 2147483647, even though there's no reason the upper bits should ever affect the result.
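As a minimal sketch of the defensive form described above: multiplying by 1u first forces the whole product into unsigned int, where wraparound is well-defined. The name mul_mod_65536 is only a renamed copy of the function above, used here for illustration.

```c
/* Sketch: keep the multiplication in unsigned arithmetic so that no
   signed overflow can occur in the intermediate product. */
unsigned mul_mod_65536(unsigned short x, unsigned short y)
{
    /* 1u * x converts x to unsigned int; the product then wraps
       modulo UINT_MAX + 1 instead of overflowing a signed int. */
    return (1u * x * y) & 0xFFFFu;
}
```

For instance, mul_mod_65536(65535, 65535) should yield 1, since 65535 * 65535 = 4294836225 = 0xFFFE0001 and only the low 16 bits are kept.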
Operations involving small signed types are similar to those using unsigned types, except that implementations are allowed to map values that exceed the range of smaller types into values of those types in Implementation-Defined fashion, or raise an Implementation-Defined signal if an attempt is made to convert an out-of-range value. In practice, nearly all implementations use two's-complement truncation even in this scenario. While some other behaviors might be cheaper in some situations, the Standard requires that implementations behave in a consistent documented fashion.
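If code should not depend on the implementation-defined conversion at all, one option is to check or clamp the value before narrowing. The helpers below are hypothetical names chosen for this sketch, and clamping is just one possible policy, not something the Standard prescribes.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if v can be stored in a signed char
   without relying on implementation-defined conversion, 0 otherwise. */
int fits_in_schar(int v)
{
    return v >= SCHAR_MIN && v <= SCHAR_MAX;
}

/* Hypothetical helper: narrows v to signed char, clamping values that
   are out of range instead of letting the implementation decide. */
signed char narrow_saturating(int v)
{
    if (v > SCHAR_MAX) return SCHAR_MAX;
    if (v < SCHAR_MIN) return SCHAR_MIN;
    return (signed char)v;   /* in range, so the conversion preserves the value */
}
```

On a platform with 8-bit char, fits_in_schar(200) would report 0 and narrow_saturating(200) would return SCHAR_MAX.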
Upvotes: 0
Reputation: 51946
According to the ISO C specification, §6.2.5 paragraph 9:
A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
In your example, though, the operands are unsigned char, so they are first promoted to signed int. Both the would-be overflow in the addition and the would-be underflow in the subtraction are therefore computed in int, where every intermediate value fits, so both are well-defined. After the expression is evaluated, the result is converted back to unsigned char, since that's the type of the variable being initialized.
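To see the quoted modular rule on its own, here is a small sketch using unsigned int, which is not promoted, so the wraparound really does happen in the operand type. The function names wrap_add and wrap_sub are made up for this illustration.

```c
#include <limits.h>

/* Unsigned int arithmetic wraps modulo UINT_MAX + 1, so neither of
   these functions can overflow in the undefined-behavior sense. */
unsigned wrap_add(unsigned a, unsigned b)
{
    return a + b;
}

unsigned wrap_sub(unsigned a, unsigned b)
{
    return a - b;
}
```

With these, wrap_add(UINT_MAX, 1u) is 0 and wrap_sub(0u, 1u) is UINT_MAX. Adding 5 to UINT_MAX and then subtracting 20 gives UINT_MAX - 15, the same as subtracting first, because modular arithmetic does not care about the order.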
Upvotes: 0
Reputation: 1272
Due to C's integer promotion, the s1 calculation is effectively executed as:
unsigned char s1 = (unsigned char)( (int)m + (int)a - (int)b );
And there is no interim overflow.
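Spelled out for both orderings, as a sketch with the promotions written explicitly (the function names are invented here, and int is assumed to be wider than unsigned char, as on typical platforms):

```c
/* Addition first: for the question's values the intermediate results
   are 260 and then 240, both of which fit easily in an int. */
unsigned char sum_add_first(unsigned char m, unsigned char a, unsigned char b)
{
    int wide = (int)m + (int)a - (int)b;
    return (unsigned char)wide;
}

/* Subtraction first: for the question's values the intermediate
   results are 235 and then 240. */
unsigned char sum_sub_first(unsigned char m, unsigned char a, unsigned char b)
{
    int wide = (int)m - (int)b + (int)a;
    return (unsigned char)wide;
}
```

Called with m = 0xFF, a = 5, b = 20, both functions return 240, which matches the identical s1 and s2 the question observed.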
Upvotes: 2
Reputation: 709
(Corrected) When doing arithmetic on integer types, operands of types smaller than int are promoted to int during the calculation (or to unsigned int if int cannot represent every value of the original type), and the result is truncated back if the destination type is smaller.
See:
https://wiki.sei.cmu.edu/confluence/display/c/INT02-C.+Understand+integer+conversion+rules
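One way to observe the promotion directly is to ask the compiler for the type of the expression. The sketch below assumes a C11 compiler (for _Generic) and a platform where int can hold every unsigned char value; IS_INT and the two function names are invented for this example.

```c
/* Evaluates to 1 if the expression has type int, 0 otherwise (C11). */
#define IS_INT(expr) _Generic((expr), int: 1, default: 0)

/* The sum of two unsigned char operands is computed as an int. */
int uchar_sum_is_int(void)
{
    unsigned char a = 5, b = 20;
    return IS_INT(a + b);
}

/* A lone unsigned char object keeps its own type; no promotion
   happens until it is used as an arithmetic operand. */
int uchar_alone_is_int(void)
{
    unsigned char a = 5;
    return IS_INT(a);
}
```

Under those assumptions, uchar_sum_is_int() returns 1 and uchar_alone_is_int() returns 0.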
Upvotes: 1