einpoklum

Reputation: 132118

What's the fastest way to switch endianness when reading from a file with C++?

I've been provided a binary file to read, which holds a sequence of raw values. For the sake of simplicity, suppose they're unsigned integral values, either 4 or 8 bytes long. Unfortunately for me, the byte order of these values is incompatible with my processor's endianness (little vs. big or vice versa; never mind weird PDP-endianness etc.), and I want this data in memory with the proper endianness.

What's the fastest way to do this, considering the fact that I'm reading the data from a file? If it's not worth exploiting this fact, please explain why that is.

Upvotes: 0

Views: 3327

Answers (2)

eerorika

Reputation: 238421

Considering the fact that you're reading the data from a file, the way you switch endianness is going to have an insignificant effect on the runtime compared to the cost of the file I/O itself.

What could make a significant difference is how you read the data. Trying to read the bytes out of order would not be a good idea. Simply read the bytes in order, and switch endianness afterwards. This separates the reading and the byte swapping.
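
As a rough illustration of "read in order, swap afterwards", here is a minimal sketch. The helper name read_swapped_u32 is hypothetical, and the sketch assumes the file holds only 4-byte values and always needs swapping; trailing bytes that do not fill a whole value are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: read a whole file of 4-byte values with one
// sequential read, then reverse the byte order of every value in memory.
std::vector<std::uint32_t> read_swapped_u32(const std::string& path) {
    assert(!path.empty());  // precondition: the caller supplies a path

    // Open at the end so tellg() reports the file size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return {};
    const std::streamoff size = in.tellg();
    if (size <= 0) return {};
    in.seekg(0);

    // Step 1: read the bytes in file order, straight into the array.
    std::vector<std::uint32_t> values(
        static_cast<std::size_t>(size) / sizeof(std::uint32_t));
    in.read(reinterpret_cast<char*>(values.data()),
            static_cast<std::streamsize>(values.size() * sizeof(std::uint32_t)));
    if (!in) return {};

    // Step 2: swap afterwards, one value at a time, entirely in memory.
    for (std::uint32_t& v : values) {
        unsigned char* p = reinterpret_cast<unsigned char*>(&v);
        std::reverse(p, p + sizeof v);
    }
    return values;
}
```

The reading and the swapping stay separate, so the swap loop can later be replaced by a faster routine without touching the I/O code.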

What I typically want from byte-swapping code, and certainly when reading a file, is that it works for any native endianness and doesn't depend on architecture-specific instructions.

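// needs <climits> for CHAR_BIT and <cstdint> for uint32_t
const unsigned char* buf = read(); // let buf be a pointer to the read buffer
uint32_t v;

// little to native: byte i of the file is the i-th least significant byte
v = 0;
for(unsigned i = 0; i < sizeof v; i++)
    v |= uint32_t(buf[i]) << CHAR_BIT * i;

// big to native: byte 0 of the file is the most significant byte
v = 0;
for(unsigned i = 0; i < sizeof v; i++)
    v |= uint32_t(buf[i]) << CHAR_BIT * (sizeof v - 1 - i);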

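This works whether the native byte order is big endian, little endian, or one of the middle-endian varieties, because the shifts operate on values rather than on the in-memory layout.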
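
Since the question mentions both 4-byte and 8-byte values, the two loops can be wrapped in templates. This is only a sketch; the names load_le and load_be are made up for illustration, and T is assumed to be an unsigned integer type.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Decode a value stored least significant byte first (little endian).
template <typename T>
T load_le(const unsigned char* p) {
    assert(p != nullptr);
    T v = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i)
        v |= static_cast<T>(p[i]) << (CHAR_BIT * i);
    return v;
}

// Decode a value stored most significant byte first (big endian).
template <typename T>
T load_be(const unsigned char* p) {
    assert(p != nullptr);
    T v = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i)
        v = static_cast<T>((v << CHAR_BIT) | p[i]);
    return v;
}
```

For example, load_be<std::uint64_t>(buf + 8 * k) would decode the k-th 8-byte big-endian value of a buffer, whatever the host byte order is.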

Of course, Boost has already implemented these for you, so there is no need to re-implement them. There is also the ntoh* family of functions (ntohs, ntohl), provided by both POSIX and the Windows sockets library, which convert between big endian and native order.

Upvotes: 2

Serge Ballesta

Reputation: 149135

Not the fastest, but a portable way would be to read the file into an (unsigned) int array, alias the int array as a char array (which the strict aliasing rule allows), and swap the bytes in memory.

Fully portable way:

void swapints(unsigned int *arr, size_t l) {
    for (size_t i = 0; i < l; i++) {
        unsigned int cur;
        // accessing any object through a char pointer is allowed by strict aliasing
        char *dest = reinterpret_cast<char *>(&cur) + sizeof cur;
        char *src = reinterpret_cast<char *>(&arr[i]);
        // copy the bytes of arr[i] into cur in reverse order
        for (size_t j = 0; j < sizeof cur; j++) *(--dest) = *(src++);
        arr[i] = cur;
    }
}

But if you do not need portability, some systems offer swapping functions. For example, BSD systems have bswap16, bswap32 and bswap64 to swap the bytes of a uint16_t, uint32_t and uint64_t respectively. No doubt equivalent functions exist in the Microsoft and GNU/Linux worlds.
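
As one possible illustration of such equivalents: GCC and Clang provide the __builtin_bswap32 and __builtin_bswap64 builtins, and MSVC provides _byteswap_ulong and _byteswap_uint64 in <stdlib.h>. The wrappers swap32, swap64 and swap_all below are hypothetical names, and the shift-based branch is only a fallback for other compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(_MSC_VER)
#include <stdlib.h>  // _byteswap_ulong, _byteswap_uint64
#endif

// Reverse the byte order of a 32-bit value.
inline std::uint32_t swap32(std::uint32_t x) {
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap32(x);
#elif defined(_MSC_VER)
    return _byteswap_ulong(x);
#else
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
#endif
}

// Reverse the byte order of a 64-bit value.
inline std::uint64_t swap64(std::uint64_t x) {
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap64(x);
#elif defined(_MSC_VER)
    return _byteswap_uint64(x);
#else
    // Swap each half, then exchange the halves.
    const std::uint64_t lo = swap32(static_cast<std::uint32_t>(x));
    const std::uint64_t hi = swap32(static_cast<std::uint32_t>(x >> 32));
    return (lo << 32) | hi;
#endif
}

// Swap every element of an array that was read from the file as-is.
inline void swap_all(std::uint32_t* arr, std::size_t n) {
    assert(arr != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i) arr[i] = swap32(arr[i]);
}
```

Unlike the shift-based decoding of a byte buffer, these functions swap unconditionally, so the caller has to know that the file's byte order really differs from the host's.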

Alternatively, if you know that the file is in network order (big endian) and your processor is not, you can use the ntohs and ntohl functions for uint16_t and uint32_t respectively.

Remark (per AndrewHenle's comment): whatever the host endianness, ntohs and ntohl can always be used; they are simply no-ops on big-endian systems.
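
A small sketch of the ntohl route for a big-endian file, assuming a POSIX system (on Windows the declaration comes from <winsock2.h> instead). The function name be32_to_host is hypothetical. Note that ntohl only covers 32-bit values, so 8-byte values need one of the other approaches.

```cpp
#include <arpa/inet.h>  // ntohl (POSIX)
#include <cassert>
#include <cstdint>
#include <cstring>

// Interpret 4 bytes from a big-endian file buffer as a host-order value.
std::uint32_t be32_to_host(const unsigned char* p) {
    assert(p != nullptr);
    std::uint32_t raw;
    std::memcpy(&raw, p, sizeof raw);  // copy the bytes exactly as stored
    return ntohl(raw);                 // no-op on big-endian hosts
}
```

Because ntohl already knows the host byte order, the same code is correct on both little-endian and big-endian machines.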

Upvotes: 1
