geometrian
geometrian

Reputation: 15407

reinterpret_cast rvalue and optimization

I am converting a bunch of code over to use C++-style casts (with the help of -Wold-style-cast). I'm not entirely sold on its use for primitive variables, but I'm new to C++-style casts in general.

One issue occurs in some endian converting code. The current code looks like this:

#define REINTERPRET_VARIABLE(VAR,TYPE) (*((TYPE*)(&VAR)))

//...

uint16_t reverse(uint16_t val) { /*stuff to reverse uint16_t*/ }
 int16_t reverse( int16_t val) {
    uint16_t temp = reverse(REINTERPRET_VARIABLE(val,uint16_t));
    return REINTERPRET_VARIABLE(temp,int16_t);
}    

Now, endianness doesn't care about signedness. Therefore, to reverse an int16_t, we can treat it exactly like a uint16_t for the purposes of the reversal. This suggests code like this:

 int16_t reverse( int16_t val) {
    return reinterpret_cast<int16_t>(reverse(reinterpret_cast<uint16_t>(val)));
}

However, as described in this and in particular this question, reinterpret_cast requires a reference or a pointer (unless it's casting to itself). This suggests:

 int16_t reverse( int16_t val) {
    return reinterpret_cast<int16_t&>(reverse(reinterpret_cast<uint16_t&>(val)));
}

This doesn't work because, as my compiler tells me, the outside cast wants an lvalue. To fix this, you'd need to do something like:

 int16_t reverse( int16_t val) {
    uint16_t temp = reverse(reinterpret_cast<uint16_t&>(val));
    return reinterpret_cast<int16_t&>(temp);
}

This is not much different from the original code, and indeed the temporary variable exists for the same reason, but four questions were raised for me:

  1. Why is a temporary even necessary for a reinterpret_cast? I can understand a dumb compiler's needing to have a temporary to support the pointer nastiness of REINTERPRET_VARIABLE, but reinterpret_cast is supposed to just reinterpret bits. Is this clashing with RVO or something?
  2. Will requiring that temporary incur a performance penalty, or is it likely that the compiler can figure out that the temporary really should just be the return value?
  3. The second reinterpret_cast looks like it's returning a reference. Since the function return value isn't a reference, I'm pretty sure this is okay; the return value will be a copy, not a reference. However, I would still like to know what casting to a reference really even means? It is appropriate in this case, right?
  4. Are there any other performance implications I should be aware of? I'd guess that reinterpret_cast would be, if anything, faster since the compiler doesn't need to figure out that the bits should be reinterpreted--I just tell it that they should?

Upvotes: 6

Views: 1840

Answers (2)

M.M
M.M

Reputation: 141613

  1. temp is required because the & (address-of) operator is applied to it on the next line. This operator requires an lvalue (the object to take the address of).

  2. I'd expect the compiler to optimize it out.

  3. reinterpret_cast<T&>(x) is the same as * reinterpret_cast<T *>(&x), it is an lvalue designating the same memory location as x occupies. Note that the type of an expression is never a reference; but the result of casting to T&, or of using the * operator is an lvalue.

  4. I wouldn't expect any performance issues.

There are no strict aliasing problems with this particular piece of code, because it is allowed to alias an integer type as the signed or unsigned variation of the same type. But you suggest the codebase is full of reinterpret casts, so you should keep your eye out for strict aliasing violations elsewhere, perhaps compile with -fno-strict-aliasing until it is sorted out.

Upvotes: 3

geometrian
geometrian

Reputation: 15407

Since no one has answered this with language-lawyery facts in two years, I'll answer it instead with my educated guesses.

  1. Who knows. But it's apparently necessary, as you've surmised. To avoid issues with strict aliasing, it would be safest to use memcpy, which will be optimized correctly by any compiler.

  2. The answer to any such question is always to profile it and to check the disassembly. In the example you gave, e.g. GCC will optimize it to:

    reverse(short):
        mov     eax, edi
        rol     ax, 8
        ret
    

    Which looks pretty optimal (the mov is for copying from the input register; if you inline your function and use it, you'll see it is absent entirely).

  3. This is a language lawyer question. Probably has some useful semantic meaning. Don't worry about it. You haven't written code like this since.

  4. Again, profile. Maybe reinterpret casting gets in the way of certain optimizations. You should follow the same guidelines as you would for strict aliasing, mentioned above.

Upvotes: 2

Related Questions