npn

Reputation: 401

Integer overflow in intermediate arithmetic expression

This might be a very basic programming question, but it's something I've wanted to understand for some time.

Consider this simple example:

#include <stdio.h>

int main(void)
{
  unsigned char a = 5;
  unsigned char b = 20;
  unsigned char m = 0xFF;

  unsigned char s1 = m + a - b;
  unsigned char s2 = m - b + a;
  printf("s1 %d s2 %d\n", s1, s2);
  return 0;
}

Given that the additive operators in C associate left to right, the first calculation should overflow at m + a. However, running this program prints the same answer for s1 and s2. My question is: does the first expression lead to undefined behavior because of the overflow? The second expression should avoid the overflow, but I would like to understand why the two expressions produce the same answer.

Upvotes: 0

Views: 900

Answers (4)

supercat

Reputation: 81257

Operations on types smaller than int are performed by converting the operands to int, doing the computation, and then converting the result back to the original type. For small unsigned types, provided the result of the computation fits in type int, this will cause the upper bits of the result to be silently discarded.

The published Rationale for the Standard suggests the authors expected that non-archaic implementations would ignore the upper bits when storing a value into an unsigned type that isn't larger than int, without regard for whether the computation would fit in type int, but it is no longer fashionable for "modern" compilers to behave reliably in that fashion. On a system with 16-bit short and 32-bit int, for example, the function

unsigned mulMod65536(unsigned short x, unsigned short y)
{ return (x*y) & 0xFFFFu; }  /* x and y promote to signed int, so x*y can overflow */

will usually behave in a fashion equivalent to:

unsigned mulMod65536(unsigned short x, unsigned short y)
{ return (1u*x*y) & 0xFFFFu; }  /* 1u forces the multiplication to be done in unsigned */

but in some cases gcc will make "clever" optimizations based on the fact that it is allowed to behave in arbitrary fashion if x*y exceeds 2147483647 (INT_MAX on such a system), even though there's no reason the upper bits should ever affect the result.

Operations involving small signed types are similar to those using unsigned types, except that implementations are allowed to map values that exceed the range of the smaller type into values of that type in Implementation-Defined fashion, or to raise an Implementation-Defined signal when an attempt is made to convert an out-of-range value. In practice, nearly all implementations use two's-complement truncation even in this scenario; while some other behaviors might be cheaper in some situations, the Standard requires that implementations behave in a consistent, documented fashion.
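
For illustration, here is a minimal sketch of that narrowing conversion (the -56 result assumes the typical two's-complement truncation; the Standard only requires some implementation-defined, documented behavior):

#include <stdio.h>

int main(void)
{
    /* 200 does not fit in signed char (typically -128..127), so the
       conversion below is implementation-defined.  On the near-universal
       two's-complement implementations it wraps: 200 - 256 == -56. */
    signed char c = (signed char)200;
    printf("%d\n", c);  /* typically prints -56 */
    return 0;
}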

Upvotes: 0

Patrick Roberts

Reputation: 51946

According to the ISO C specification, §6.2.5 ¶9:

A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.

This means that the would-be positive and negative overflows that seem to occur in your addition and subtraction never actually happen: the operands are promoted to int, and the intermediate values (260 and 240) fit comfortably in int, so the arithmetic is well-defined. When the result is stored, it is converted back to unsigned char; per the quoted rule, that conversion reduces the value modulo 256, which here leaves 240 unchanged.
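
To see the modulo reduction in isolation, a small sketch using the question's values (assuming an 8-bit unsigned char, so the reduction is modulo 256):

#include <stdio.h>

int main(void)
{
    unsigned char m = 0xFF;
    unsigned char a = 5;

    /* m + a is computed as int and yields 260; storing it into an
       unsigned char reduces it modulo 256, giving 4. */
    unsigned char wrapped = m + a;
    printf("wrapped %d\n", wrapped);  /* prints: wrapped 4 */
    return 0;
}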

Upvotes: 0

user5329483

Reputation: 1272

Due to C's integer promotions, the s1 calculation is effectively executed as:

unsigned char s1 = (unsigned char)( (int)m + (int)a - (int)b );

And there is no interim overflow, since every intermediate value (260, then 240) fits comfortably in int.
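
A minimal sketch tracing both orderings in the promoted type (assuming 8-bit unsigned char and a wider int, as on virtually all hosted implementations):

#include <stdio.h>

int main(void)
{
    unsigned char a = 5, b = 20, m = 0xFF;

    /* Both orderings are computed entirely in int, so nothing wraps. */
    int t1 = (int)m + (int)a - (int)b;  /* 255 + 5 - 20 == 240 */
    int t2 = (int)m - (int)b + (int)a;  /* 255 - 20 + 5 == 240 */
    printf("t1 %d t2 %d\n", t1, t2);    /* prints: t1 240 t2 240 */
    return 0;
}

Since 240 fits in unsigned char, the final conversion stores it unchanged, which is why s1 and s2 agree.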

Upvotes: 2

Doug

Reputation: 709

(Corrected) When doing arithmetic on integer types, operands of types smaller than int are promoted to int for the calculation, and the result is then truncated back if the destination type is smaller.

See:

https://wiki.sei.cmu.edu/confluence/display/c/INT02-C.+Understand+integer+conversion+rules
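
A short sketch of that promote-then-truncate sequence (hypothetical values, assuming 8-bit unsigned char):

#include <stdio.h>

int main(void)
{
    unsigned char x = 200, y = 100;

    /* x + y is computed in int as 300; the store into an unsigned char
       truncates it modulo 256, leaving 44. */
    unsigned char sum = x + y;
    printf("sum %d\n", sum);  /* prints: sum 44 */
    return 0;
}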

Upvotes: 1
