Reputation: 23497
In the current C++ Standard Draft, the left shift operator is defined as follows [expr.shift]:
The value of
E1 << E2
is the unique value congruent toE1×2^E2
modulo2^N
, whereN
is the width of the type of the result.
Consider int E1 = 2^31-1 = 2'147'483'647
, E2 = 1
, and int
having 32 bits. Then there is an infinite number of numbers congruent to E1×2^E2 = 4'294'967'294
modulo 2^N = 2^32
, namely, all the numbers 4'294'967'294 + k×2^32
where k
is an arbitrary integer. Examples are 4'294'967'294
(k=0
) or -2
(k=-1
).
I don't understand what the Standard means by the unique value out of these numbers. Does it mean the unique value that can be represented by the resulting data type? Then, I suppose the result is defined as -2
. Is this interpretation correct?
Until C++20, the definition was different and this case would cause undefined behavior. I suppose the change is related to the mandatory 2's-complement representation of negative signed integers.
In fact, there is now no more requirement for E1
to be non-negative. It therefore seems that -1 << 1
is defined as -2
. Is that right as well?
Upvotes: 17
Views: 2186
Reputation: 75727
Does it mean the unique value that can be represented by the resulting data type
Yes. The set of numbers congruent to E1×2^E2
modulo 2^N
is infinite, but there is only one value in any interval of size 2^N
, therefore there is only one value representable in an integer type of width N
.
If we look in the "p0907R1 Signed Integers are Two’s Complement" proposal we find a similar phrase with "unique representation" which makes this more clear:
Conversion from signed to unsigned is always well-defined: the result is the unique value of the destination type that is congruent to the source integer modulo 2N.
Then, I suppose the result is defined as
-2
. Is this interpretation correct?
Yes
On x64 the equivalent asm instruction is shlx
(logical shift left)
I suppose the change is related to the mandatory 2-complement representation of negative signed integers.
Correct. As was the case with unsigned types, now also signed types they mathematically represent equivalence classes (well, it's not clear to me how much this is true as it looks like they want to still keep some UB cases on overflow).
Upvotes: 10
Reputation: 141200
So we know that:
E1 = 2147483647
E2 = 1
N = sizeof(int) * CHAR_BIT = 4 * 8 = 32
Let's compute E1×2^E2 modulo 2^N
(modulo is the remainder of the division):
x = E1×2^E2 mod 2^N = 2147483647 * 2 ^ 1 mod 4294967296 = 4294967294 mod 4294967296 = 4294967294
Then we go to here:
For each value x of a signed integer type, the value of the corresponding unsigned integer type congruent to x modulo 2 N has the same value of corresponding bits in its value representation.
and I think we also need:
The base-2 representation of a value of signed integer type is the base-2 representation of the congruent value of the corresponding unsigned integer type.
That means, that x = 4294967294
is equal to x = -2
for signed int
. So the result will be -2
.
It therefore seems that -1 << 1 is defined as -2. Is it right as well?
(signed)-1 << 1 =
4294967295 << 1 =
4294967295 * 2 ^ 1 mod 4294967296 =
8589934590 mod 4294967296 =
4294967294 =
(signed)-2
Upvotes: 3