bolov

Reputation: 75904

Bit operations with integer promotion

tl;dr: Is bit manipulation safe, and does it behave as expected, when it goes through integer promotion (with types shorter than int)?

e.g.

uint8_t a, b, c;
a = b & ~c;

This is a rough MCVE of what I have:

struct X { // this is actually templated
  using U = unsigned; // U is actually a dependent name and can change
  U value;
};

template <bool B> auto foo(X x1, X x2) -> X
{
  if (B)
    return {x1.value | x2.value};
  else
    return {x1.value & ~x2.value};
}

This works great, but when U is changed to an integer type shorter than int, e.g. std::uint8_t, integer promotion kicks in and I get a warning:

warning: narrowing conversion of '(int)(((unsigned char)((int)x1.X::value)) | ((unsigned char)((int)x2.X::value)))' from 'int' to 'X::U {aka unsigned char}' inside { } [-Wnarrowing]

So I added a static_cast:

#include <cstdint>

struct X {
  using U = std::uint8_t;
  U value;
};

template <bool B> auto foo(X x1, X x2) -> X
{
  if (B)
    return {static_cast<X::U>(x1.value | x2.value)};
  else
    return {static_cast<X::U>(x1.value & ~x2.value)};
}

The question: Can the integer promotion followed by the narrowing cast mess with the intended results (*)? Especially since these conversions change signedness back and forth (unsigned char -> int -> unsigned char). And what if U is signed, e.g. std::int8_t? (It won't be signed in my code, but I'm curious how it would behave if it were.)

My common sense says the code is perfectly OK, but my C++ paranoia says there is at least a chance of implementation-defined behavior.

(*) In case it's not clear (or I messed up): the intended behavior is to set or clear bits (x1 is the value, x2 is the mask, B selects the set/clear operation).

Upvotes: 2

Views: 361

Answers (1)

Serge Ballesta

Reputation: 149185

If you use unsigned types, all will be OK. The standard mandates that for unsigned target integer types, narrowing is perfectly defined:

4.7 Integral conversions [conv.integral]
...
2 If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type).
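
As a rough illustration of why the unsigned case is safe, here is a minimal sketch (set_bits, clear_bits and unsigned_results_fit are hypothetical names, not taken from the question). It checks every pair of std::uint8_t operands and confirms that the promoted int result of both expressions already lies in [0, 255], so the cast back to std::uint8_t does not change the value:

```cpp
#include <cassert>  // only needed for the usage shown after the block
#include <cstdint>

// Hypothetical helpers mirroring the question's foo<true> / foo<false>.
constexpr std::uint8_t set_bits(std::uint8_t value, std::uint8_t mask)
{
    // Both operands are promoted to int; the result lies in [0, 255].
    return static_cast<std::uint8_t>(value | mask);
}

constexpr std::uint8_t clear_bits(std::uint8_t value, std::uint8_t mask)
{
    // ~mask is a negative int, but the promoted value has all bits above
    // bit 7 clear, so the AND again yields a result in [0, 255].
    return static_cast<std::uint8_t>(value & ~mask);
}

// Exhaustively checks all 256 * 256 operand pairs.
bool unsigned_results_fit()
{
    for (int v = 0; v <= 255; ++v) {
        for (int m = 0; m <= 255; ++m) {
            auto value = static_cast<std::uint8_t>(v);
            auto mask = static_cast<std::uint8_t>(m);
            int set = value | mask;       // computed in int after promotion
            int cleared = value & ~mask;  // computed in int after promotion
            if (set < 0 || set > 255) return false;
            if (cleared < 0 || cleared > 255) return false;
            // The narrowing cast preserves the value...
            if (set_bits(value, mask) != set) return false;
            if (clear_bits(value, mask) != cleared) return false;
            // ...and the clear matches an 8-bit-only computation.
            if (cleared != (v & (m ^ 0xFF))) return false;
        }
    }
    return true;
}
```

Calling assert(unsigned_results_fit()) from a test or from main should pass: the modulo 2^n rule never actually has to wrap anything for these two expressions.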

But if the target type is signed, the result is implementation-defined, per the next paragraph (emphasis mine):

3 If the destination type is signed, the value is unchanged if it can be represented in the destination type; otherwise, the value is implementation-defined.

In common implementations everything will be OK, because the simplest thing for a compiler to do is to perform a narrowing conversion by keeping only the low-order bits, for unsigned and signed types alike. But the standard only requires that the implementation define what happens. An implementation could document that narrowing a value to a signed type, when the original value cannot be represented in that type, yields 0, and it would still be conformant.
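
To see which inputs actually reach paragraph 3, here is a small sketch for the std::int8_t case the question is curious about (fits_in_int8 and signed_bitwise_results_fit are hypothetical names). It assumes a two's complement int, which C++20 makes mandatory but earlier standards did not. Under that assumption the promoted operands are sign-extended, so the question's two bitwise expressions always produce a value that std::int8_t can represent, and the implementation-defined branch is not taken for them. An out-of-range source value such as 100 + 100 does take it:

```cpp
#include <cassert>  // only needed for the usage shown after the block
#include <cstdint>

// Hypothetical helper: true when converting v to std::int8_t leaves the
// value unchanged (the first half of paragraph 3).
constexpr bool fits_in_int8(int v)
{
    return v >= INT8_MIN && v <= INT8_MAX;
}

// Checks every pair of std::int8_t operands for the question's two
// expressions. Assumes a two's complement int.
bool signed_bitwise_results_fit()
{
    for (int a = INT8_MIN; a <= INT8_MAX; ++a) {
        for (int b = INT8_MIN; b <= INT8_MAX; ++b) {
            auto x1 = static_cast<std::int8_t>(a);  // in range, value unchanged
            auto x2 = static_cast<std::int8_t>(b);  // in range, value unchanged
            if (!fits_in_int8(x1 | x2)) return false;
            if (!fits_in_int8(x1 & ~x2)) return false;
        }
    }
    return true;
}

// By contrast, an arithmetic result can fall outside the range, and
// converting it to std::int8_t is the implementation-defined case.
static_assert(!fits_in_int8(100 + 100),
              "200 is not representable in std::int8_t");
```

Calling assert(signed_bitwise_results_fit()) should pass on a two's complement platform. This only covers the two expressions from the question; any other computation whose int result leaves [-128, 127] still depends on what the implementation documents.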


By the way, since C++ and C often handle conversions the same way, it should be noted that the C standard is slightly different here, because the last case may raise a signal:

6.3.1.3 [Conversions] Signed and unsigned integers
...
3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

Still a confirmation that C and C++ are different languages...

Upvotes: 2
