blipman17
blipman17

Reputation: 531

bitshifting unsigned char and unsigned long long gone wrong

This is a function I (want to) use to decode numbers out of unsigned char[] buffers for networking.

inline unsigned long long getULongLongLongInt(const unsigned char* buffer)
{
    unsigned long long number= (buffer[0] << 56);
    number|= (buffer[1] << 48);
    number|= (buffer[2] << 40);
    number|= (buffer[3] << 32);
    number|= (buffer[4] << 24);
    number|= (buffer[5] << 16);
    number|= (buffer[6] << 8);
    number|= buffer[7];
    return number;
}

I'm getting warning C4293 '<<': shift count negative or too big, undefined behaviour" four times for the most upper bitshifts; Is this a warning I can safely ignore because the compiler doesn't recognise I'm using an unsigned 64 bit int? I presume it isn't. But then how do I fix this?

Upvotes: 1

Views: 483

Answers (4)

sailfish009
sailfish009

Reputation: 2927

change code like this:

//unsigned long long number= (buffer[0] << 56);
unsigned long long number= ((buffer[0] << 31) << 25);

Upvotes: 0

Richard Hodges
Richard Hodges

Reputation: 69912

I think it's best to explicitly state intent and let the compiler's optimiser work its magic:

#include <cstdint>
#include <utility>

template<class Unsigned,
std::enable_if_t<std::is_unsigned<Unsigned>::value>* = nullptr
  >
inline Unsigned decode_int_msb(const unsigned char* buffer)
{
  //
  // some helpful types and constants
  //

  using type = Unsigned;
  static constexpr auto bytes = sizeof(type);
  static constexpr auto bits = bytes * 8;

  //
  // a helpful local function
  //

  auto byte = [buffer](auto i) {
    return type(buffer[i]) << bits - ((i+1) * 8);
  };

  //
  // simplified algorithm here
  //

  type acc = 0;
  for(std::size_t i = 0 ; i < bytes ; ++i)
    acc += byte(i);
    return acc;
}

int main(int argc, char** argv)
{

  // force expansion of some templates

  auto x =  decode_int_msb<unsigned long long>(reinterpret_cast<const unsigned char*>(argv[0]));
  auto y =  decode_int_msb<unsigned long>(reinterpret_cast<const unsigned char*>(argv[0]));
  auto z =  decode_int_msb<unsigned short>(reinterpret_cast<const unsigned char*>(argv[0]));
  return x + y + z;
}

example assembler output of the unsigned long long version (when not inlined):

unsigned long decode_int_msb<unsigned long, (void*)0>(unsigned char const*):
        movzx   ecx, BYTE PTR [rdi+1]
        movzx   edx, BYTE PTR [rdi+2]
        movzx   eax, BYTE PTR [rdi+3]
        mov     rsi, rcx
        sal     rdx, 40
        sal     rsi, 48
        sal     rax, 32
        lea     rcx, [rsi+rdx]
        lea     rdx, [rcx+rax]
        movzx   eax, BYTE PTR [rdi]
        sal     rax, 56
        add     rax, rdx
        movzx   edx, BYTE PTR [rdi+4]
        sal     rdx, 24
        add     rdx, rax
        movzx   eax, BYTE PTR [rdi+5]
        sal     rax, 16
        add     rdx, rax
        movzx   eax, BYTE PTR [rdi+6]
        sal     rax, 8
        add     rax, rdx
        movzx   edx, BYTE PTR [rdi+7]
        add     rax, rdx
        ret

Upvotes: 0

Brian Bi
Brian Bi

Reputation: 119457

No, you can't ignore it. The operand buffer[i] is of type unsigned char, which is probably promoted to int (and if not int, then unsigned int). If 56 is greater than or equal to the bit width of int, then the shift is UB.

You need to write static_cast<unsigned long long>(buffer[0]) << 56 and so on, so the operand will be at least 64 bits long before the shift.

Upvotes: 2

Sam Varshavchik
Sam Varshavchik

Reputation: 118445

When used in an expression, unsigned char values are promoted to ints. Attempting to shift an int by 56 bits will, obviously, not be very productive.

You have to be more explicit:

 unsigned long long number= ((unsigned long long)buffer[0] << 56);
    number|= ((unsigned long long)buffer[1] << 48);

... and so on. You have to force a conversion to unsigned long long before the shift operation takes place.

Upvotes: 1

Related Questions