Reputation: 1602
I am amazed at how many topics on StackOverflow deal with finding out the endianness of the system and converting endianness, and even more amazed that there are hundreds of different answers to these two questions. All proposed solutions that I have seen so far are based on undefined behaviour, non-standard compiler extensions or OS-specific header files. In my opinion, this question is only a duplicate if an existing answer gives a standard-compliant, efficient (e.g., compiles down to x86 bswap), compile-time-enabled solution.
Surely there must be a standard-compliant solution available that I am unable to find in the huge mess of old "hacky" ones. It is also somewhat strange that the standard library does not include such a function. Perhaps the attitude towards such issues is changing, since C++20 introduced a way to detect endianness into the standard (via std::endian), and C++23 will probably include std::byteswap, which flips endianness.
In any case, my questions are these:
Starting with which C++ standard is there a portable, standard-compliant way of performing host-to-network byte order conversion?
I argue below that it's possible in C++20. Is my code correct and can it be improved?
Should such a pure-C++ solution be preferred to OS-specific functions such as, e.g., POSIX htonl? (I think yes.)
I think I can give a C++23 solution that is OS-independent, efficient (no system call; compiles to x86 bswap) and portable to little-endian and big-endian systems (but not portable to mixed-endian systems):
// requires C++23. see https://gcc.godbolt.org/z/6or1sEvKn
#include <concepts> // for std::integral
#include <bit>      // for std::endian and std::byteswap

constexpr inline auto host_to_net(std::integral auto i) {
    static_assert(std::endian::native == std::endian::big ||
                  std::endian::native == std::endian::little);
    if constexpr (std::endian::native == std::endian::big) {
        return i;
    } else {
        return std::byteswap(i);
    }
}
Since std::endian is available in C++20, one can give a C++20 solution for host_to_net by implementing byteswap manually. A solution is described here; quote:
// requires C++17
#include <climits>
#include <cstdint>
#include <type_traits>
#include <utility> // for std::index_sequence (missing in the original quote)

template<class T, std::size_t... N>
constexpr T bswap_impl(T i, std::index_sequence<N...>) {
    return ((((i >> (N * CHAR_BIT)) & (T)(unsigned char)(-1)) <<
             ((sizeof(T) - 1 - N) * CHAR_BIT)) | ...); // fold expression
}

template<class T, class U = typename std::make_unsigned<T>::type>
constexpr U bswap(T i) {
    return bswap_impl<U>(i, std::make_index_sequence<sizeof(T)>{});
}
The linked answer also provides a C++11 byteswap, but that one seems to be less efficient (it is not compiled to x86 bswap). I think there should be an efficient C++11 way of doing this, too (using either less template-nonsense or even more), but I don't care about older C++ and didn't really try.
Assuming I am correct, the remaining question is: can one determine system endianness before C++20 at compile time in a standard-compliant and compiler-agnostic way? None of the answers here seem to achieve this. They use reinterpret_cast (not compile time), OS headers, union aliasing (which I believe is UB in C++), etc. Also, for some reason, they try to do it "at runtime", although a compiled executable will always run under the same endianness.
One could do it outside of a constexpr context and hope it's optimized away. Alternatively, one could use system-defined preprocessor definitions and account for all platforms, as seems to be the approach taken by Boost. Or maybe (although I would guess the other way is better?) use macros and pick platform-specific htonl-style functions from networking libraries (done, e.g., here (GitHub))?
Upvotes: 5
Views: 661
Reputation: 1602
I made a benchmark comparing my C++ solution from the question and the solution by eeroika from the accepted answer.
Looking into this was a complete waste of time, but now that I did it, I thought I might as well share it. The result is that (in the specific, not-quite-realistic use case I look at) they seem to be equivalent in terms of performance. This is despite my solution being compiled to use x86 bswap, while the solution by eeroika does it by just using mov.
The performance differs a lot (!!) when using different compilers, and the main thing I learned from these benchmarks is, again, that I was just wasting my time...
// benchmark comparing two stand-alone C++20 host-to-big-endian endianness conversions.
// Run at quick-bench.com! This is not a complete program. (https://quick-bench.com/q/2qnr4xYKemKLZupsicVFV_09rEk)
// To run locally, include the Google Benchmark header and a main method as required by the benchmarking library.
// Adapted from https://stackoverflow.com/a/71004000/9988487
#include <type_traits>
#include <utility>
#include <concepts>
#include <cstddef>
#include <cstdint>
#include <climits>
#include <cstring>
#include <bit>
#include <limits>
#include <random>
#include <ranges>
#include <vector>
/////////////////////////////// Solution 1 ////////////////////////////////
template <typename T> struct scalar_t { T t{}; /* no begin/end */ };
static_assert(not std::ranges::range< scalar_t<int> >);

template<class T, std::size_t... N>
constexpr T bswap_impl(T i, std::index_sequence<N...>) noexcept {
    constexpr auto bits_per_byte = 8u;
    static_assert(bits_per_byte == CHAR_BIT);
    return ((((i >> (N * bits_per_byte)) & (T)(unsigned char)(-1)) <<
             ((sizeof(T) - 1 - N) * bits_per_byte)) | ...); // fold expression
}

template<class T, class U = typename std::make_unsigned<T>::type>
constexpr U bswap(T i) noexcept {
    return bswap_impl<U>(i, std::make_index_sequence<sizeof(T)>{});
}

constexpr inline auto host_to_net(std::integral auto i) {
    static_assert(std::endian::native == std::endian::big ||
                  std::endian::native == std::endian::little);
    if constexpr (std::endian::native == std::endian::big) {
        return i;
    } else {
        return bswap(i); // replace by `std::byteswap` once it's available!
    }
}
/////////////////////////////// Solution 2 ////////////////////////////////
// helper to promote an integer type
template <class T>
using promote_t = std::decay_t<decltype(+std::declval<T>())>;

template <class T, std::size_t... I>
constexpr void host_to_big_impl(
        unsigned char* buf,
        T t,
        [[maybe_unused]] std::index_sequence<I...>) noexcept {
    using U = std::make_unsigned_t<promote_t<T>>;
    constexpr U lastI = sizeof(T) - 1u;
    constexpr U bits = 8u;
    U u = t;
    ( (buf[I] = u >> ((lastI - I) * bits)), ... );
}

template <class T>
constexpr void host_to_big(unsigned char* buf, T t) noexcept {
    using Indices = std::make_index_sequence<sizeof(T)>;
    return host_to_big_impl<T>(buf, t, Indices{});
}
//////////////////////// Benchmarks ////////////////////////////////////
template<std::integral T>
std::vector<T> get_random_vector(std::size_t length, unsigned int seed) {
    // NOTE: it is very slow to recreate the RNG every time. Don't use in production code!
    std::mt19937_64 rng{seed};
    std::uniform_int_distribution<T> distribution(
        std::numeric_limits<T>::min(), std::numeric_limits<T>::max());
    std::vector<T> result(length);
    for (auto && val : result) {
        val = distribution(rng);
    }
    return result;
}

template<>
std::vector<bool> get_random_vector<bool>(std::size_t length, unsigned int seed) {
    // NOTE: it is very slow to recreate the RNG every time. Only use for testing!
    std::mt19937_64 rng{seed};
    std::bernoulli_distribution distribution{0.5};
    std::vector<bool> vec(length);
    for (auto && val : vec) {
        val = distribution(rng);
    }
    return vec;
}
constexpr std::size_t n_ints{1000};

static void solution1(benchmark::State& state) {
    std::vector<int> intvec = get_random_vector<int>(n_ints, 0);
    std::vector<std::uint8_t> buffer(sizeof(int)*intvec.size());
    for (auto _ : state) {
        for (std::size_t i{}; i < intvec.size(); ++i) {
            // Solution 1: byteswap the value, then copy its bytes out
            // (the original draft stored only one byte here, truncating the int).
            const auto big = host_to_net(intvec[i]);
            std::memcpy(buffer.data() + sizeof(int)*i, &big, sizeof(int));
        }
        benchmark::DoNotOptimize(buffer);
        benchmark::ClobberMemory();
    }
}
BENCHMARK(solution1);

static void solution2(benchmark::State& state) {
    std::vector<int> intvec = get_random_vector<int>(n_ints, 0);
    std::vector<std::uint8_t> buffer(sizeof(int)*intvec.size());
    for (auto _ : state) {
        for (std::size_t i{}; i < intvec.size(); ++i) {
            // Solution 2: write the bytes in big-endian order directly.
            host_to_big(buffer.data() + sizeof(int)*i, intvec[i]);
        }
        benchmark::DoNotOptimize(buffer);
        benchmark::ClobberMemory();
    }
}
BENCHMARK(solution2);
Upvotes: 0
Reputation: 238311
compile time-enabled solution.
Consider whether this is a useful requirement in the first place. The program isn't going to be communicating with another system at compile time. In which case would you need to use the serialised integer in a compile-time constant context?
- Starting at what C++ standard is there a portable standard-compliant way of performing host to network byte order conversion?
It has been possible to write such a function in standard C++ since C++98. That said, later standards bring tasty template goodies that make this nicer.
There is no such function in the standard library as of the latest standard.
- Should such a pure-c++ solution be preferred to OS specific functions such as, e.g., POSIX-htonl? (I think yes)
The advantage of POSIX is that it's less important to write tests to make sure it works correctly.
The advantage of a pure C++ function is that you don't need platform-specific alternatives for systems that don't conform to POSIX.
Also, the POSIX htonX functions are only for 16-bit and 32-bit integers. You could use the htobeXX functions instead, which exist on some *BSDs and on Linux (glibc).
Here is what I have been using since C++17. Some notes beforehand:
Since endianness conversion is always¹ for purposes of serialisation, I write the result directly into a buffer. When converting to host endianness, I read from a buffer.
I don't use CHAR_BIT because the network doesn't know my byte size anyway. A network byte is an octet, and if your CPU's byte is different, then these functions won't work. Correct handling of a non-octet byte is possible but unnecessary work unless you need to support network communication on such a system. Adding an assert might be a good idea.
I prefer to call it big endian rather than "network" endian. There's a chance that a reader isn't aware of the convention that the de-facto endianness of the network is big.
Instead of checking "if native endianness is X, do Y, else do Z", I prefer to write a function that works with all native endiannesses. This can be done with bit shifts.
Yeah, it's constexpr. Not because it needs to be, but because it can be. I haven't been able to produce an example where dropping constexpr would result in worse code.
// helper to promote an integer type
template <class T>
using promote_t = std::decay_t<decltype(+std::declval<T>())>;

template <class T, std::size_t... I>
constexpr void
host_to_big_impl(
    unsigned char* buf,
    T t,
    [[maybe_unused]] std::index_sequence<I...>) noexcept
{
    using U = std::make_unsigned_t<promote_t<T>>;
    constexpr U lastI = sizeof(T) - 1u;
    constexpr U bits = 8u;
    U u = t;
    ( (buf[I] = u >> ((lastI - I) * bits)), ... );
}

template <class T>
constexpr void
host_to_big(unsigned char* buf, T t) noexcept
{
    using Indices = std::make_index_sequence<sizeof(T)>;
    return host_to_big_impl<T>(buf, t, Indices{});
}
¹ In all use cases I've encountered. Conversions from integer to integer can be implemented by delegating to these functions, if you have such a case, although they cannot be constexpr due to the need for reinterpret_cast.
Upvotes: 4