Reputation: 359
Consider the following code:
#include <iostream>
using namespace std;

int main() {
    // the following is expected not to print 4000000000,
    // because an expression with two `int` operands
    // yields another `int`, and the actual result
    // doesn't fit into an `int`
    cout << 2 * 2000000000 << endl;  // prints -294967296

    // making one operand unsigned therefore produces the correct result
    cout << 2 * 2000000000U << endl; // prints 4000000000
}
I played around a bit with casting the result to different integer types, and came across some weird behavior.
#include <iostream>
using namespace std;

int main() {
    // unexpectedly this does print the correct result
    cout << (unsigned int)(2 * 2000000000) << endl; // prints 4000000000

    // this produces the same wrong result as the original statement
    cout << (long long)(2 * 2000000000) << endl;    // prints -294967296
}
I expected neither of these statements to produce the correct result; how come one succeeded and the other didn't?
Upvotes: 2
Views: 1181
Reputation: 5855
In C++ the type of an expression does not (usually) depend on its surrounding context.
The subexpression 2 * 2000000000 therefore has the same type and value no matter what the containing expression is: it is an int, since both operands of the * operator are ints. Mathematically the value would be 4000000000, but on your architecture it became -294967296 because of an overflow.
Casting it to long long won't change the value, because long long can represent -294967296 just fine.
Actually it is much more interesting that cout << (unsigned int)(2 * 2000000000) << endl;
works. As unsigned int cannot hold -294967296, a conversion takes place: -294967296 and 4000000000 are congruent modulo 2^32, so 4000000000 becomes the new value. (Updated from the better answer of GManNickG.)
To illustrate the deeper problem, try
cout << (unsigned int)(2 * 2000000000 / 2) << endl;
The division is executed on -294967296, and the resulting -147483648 is then converted to unsigned, which gives 4147483648.
Upvotes: 1
Reputation: 503835
Way too much confusion is going on among people trying to answer this question.
Let's examine:
2 * 2000000000
This is an int multiplied by an int. §5/4 tells us:
If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined.
This result is mathematically defined, but is it in the range of representable values for int?
That depends. On many common architectures int has 32 bits to represent values, giving it a maximum value of 2,147,483,647. Since the mathematical result of this is 4,000,000,000, such an architecture would not be able to represent the value and the behavior is undefined. (This pretty much kills the question, because now the behavior of the entire program is undefined.)
But that's just dependent on the platform. If int were 64 bits wide instead (note: long long is guaranteed to have at least 64 bits to represent values), the result would fit just fine.
Let's just fix up the problem a bit though and go straight to this:
int x = -294967296; // -294,967,296
And let's further say this fits within the range of int (which for a 32-bit int it does).
Now let's cast this to an unsigned int:
unsigned int y = static_cast<unsigned int>(x);
What is the value of y? It has nothing to do with the bit representation of x.
There is no "bit cast" where the compiler simply treats the bits as an unsigned quantity. Conversions work with values. The value of a signed int converted to an unsigned int is defined in §4.7/2:
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). —end note ]
For us, on our system with a 32-bit unsigned int, this means 4000000000. This works regardless of bits: two's complement, one's complement, magic's complement, etc. These are irrelevant.
The reason you saw the value you wanted in the first place (ignoring UB) is that on your two's complement machine, the difference between signed and unsigned integers is indeed a matter of viewing the bits differently. So when you multiplied those two ints, you were "really" multiplying two unsigned integers, ignoring the overflow, and viewing the result as a signed integer. Then the cast changes your view once more.
But the casting works independently of bits!
Upvotes: 4
Reputation: 32502
Note that signed integer overflow is undefined behavior. Consequently, anything can happen, including innocently correct results.
Both integer literals 2 and 2000000000 are 32 bits wide. The multiplication overflows, as your compiler tells you:
warning: integer overflow in expression [-Woverflow]
The result of the multiplication is still a 32-bit signed integer. In this case, the bit pattern produced by the overflow happens to be the correct result when viewed as an unsigned 32-bit integer. You can observe this by casting it to a 32-bit unsigned int.
However, if you cast the value to an integer type of a larger width (e.g. 64 bits), the leading bytes will be padded with ff (sign extension), giving a false result.
#include <iostream>

int main() {
    long long x = 2 * 2000000000;     // 8 bytes wide
    unsigned int y = 2 * 2000000000;  // 4 bytes wide
    unsigned long z = 2 * 2000000000; // 8 bytes wide
    std::cout << std::hex << x << " " << std::dec << x << std::endl;
    // output is: ffffffffee6b2800 -294967296
    std::cout << std::hex << y << " " << std::dec << y << std::endl;
    // output is: ee6b2800 4000000000
    std::cout << std::hex << z << " " << std::dec << z << std::endl;
    // output is: ffffffffee6b2800 18446744073414584320
}
Upvotes: 0
Reputation: 2185
In the third (weird) case, the running program does this:
2 * 2000000000 = the bit pattern 11101110011010110010100000000000
printed as unsigned = 4000000000
(the most significant bit (1) is treated as part of the magnitude)
The fourth case:
2 * 2000000000 = the same bit pattern as above
printed as signed = -294967296
(the most significant bit (1) marks the number as negative)
The important thing to learn is that the expression 2 * 2000000000 produces a bit pattern, which is then interpreted according to the type the cast operation names.
Upvotes: 0
Reputation: 9841
In a (32-bit) int, the result of 2 * 2000000000 is stored as the bit pattern
1110 1110 0110 1011 0010 1000 0000 0000
In an unsigned int, the value of 4,000,000,000 is written as
1110 1110 0110 1011 0010 1000 0000 0000
Looking at these two, you can see that they are the same.
The difference comes from the way the bits are read in an int versus an unsigned int: in a regular int, the most significant bit is used to tell whether the number is negative or not.
Upvotes: 3