Manav

Reputation: 10324

Native Endians and Auto Conversion

The following function converts a big-endian 32-bit value to little-endian:

uint32_t ntoh32(uint32_t v)
{
    return (v << 24)
        | ((v & 0x0000ff00) << 8)
        | ((v & 0x00ff0000) >> 8)
        | (v >> 24);
}

It works like a charm.

I read 4 bytes from a big endian file into char v[4] and pass it into the above function as

 ntoh32 (* reinterpret_cast<uint32_t *> (v))

That doesn't work, because my compiler (VS 2005) automatically converts the big-endian char[4] into a little-endian uint32_t when I do the cast.
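
A side note on the cast itself: dereferencing a char buffer through a uint32_t pointer can run into alignment and strict-aliasing problems on some compilers. A minimal sketch of a safer way to obtain the same raw value, using memcpy (the helper name load_raw32 is made up for illustration and is not part of the original code):

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies 4 bytes into a uint32_t without any byte swapping.
// The numeric result depends on the host's byte order, just as with the cast.
inline std::uint32_t load_raw32(const char* v)
{
    std::uint32_t raw;
    std::memcpy(&raw, v, sizeof raw);  // byte-for-byte copy, no alignment requirement on v
    return raw;
}
```

The value returned here still has to go through a swap such as ntoh32 on a little-endian host; the sketch only replaces the pointer cast.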

AFAIK, this automatic conversion will not be portable, so I use

uint32_t ntoh_4b(char v[])
{
    uint32_t a = 0;
    a |= (unsigned char)v[0];
    a <<= 8;
    a |= (unsigned char)v[1];
    a <<= 8;
    a |= (unsigned char)v[2];
    a <<= 8;
    a |= (unsigned char)v[3];
    return a;
}

Yes, the (unsigned char) cast is necessary: where plain char is signed, bytes of 0x80 and above would otherwise be sign-extended. And yes, it is dog slow.
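
To illustrate why the cast matters, here is a small sketch that uses signed char explicitly, so the effect shows up regardless of whether plain char is signed on a given platform. Both function names are invented for this demonstration:

```cpp
#include <cstdint>

// Combines 4 bytes WITHOUT going through unsigned char first.
// A negative byte (e.g. -1, bit pattern 0xff) widens to 0xffffffff
// and overwrites the bytes that were already shifted in.
inline std::uint32_t combine_signed(const signed char v[4])
{
    std::uint32_t a = 0;
    for (int i = 0; i < 4; ++i) {
        a <<= 8;
        a |= static_cast<std::uint32_t>(v[i]);
    }
    return a;
}

// Same loop, but each byte is converted to unsigned char before widening,
// so only the low 8 bits are ORed in.
inline std::uint32_t combine_unsigned(const signed char v[4])
{
    std::uint32_t a = 0;
    for (int i = 0; i < 4; ++i) {
        a <<= 8;
        a |= static_cast<unsigned char>(v[i]);
    }
    return a;
}
```

With the bytes 0x12, 0xff, 0x34, 0x56 the first version yields 0xffff3456 while the second yields the intended 0x12ff3456.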

There must be a better way. Anyone?

Upvotes: 2

Views: 1092

Answers (3)

Manav

Reputation: 10324

(Posting this as a separate answer to preserve indentation)

The test rig...

    union {
        char a[4];
        uint32_t i;
    } t;
    t.i = 0xaabbccdd;

    uint32_t v;
    for (uint32_t i = 0; i < 0xffffffffu; ++i)  // roughly 4.29 billion iterations
    {
        //v = ntohl (t.i); (1)
        //v = ntoh32 (t.i); (2)
        //v = ntoh_4b (t.a);  (3)
    }
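
As an alternative to timing the whole process from outside, the loop could be timed in-process. The following is only a sketch under assumptions: it uses std::chrono (C++11), and the helper name time_calls is hypothetical. The volatile sink is there so an optimising compiler cannot drop the loop entirely, although it may still simplify the call when the argument never changes.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical timing helper: calls fn(arg) `iterations` times and
// returns the elapsed wall-clock time in seconds.
template <typename Fn, typename Arg>
double time_calls(Fn fn, Arg arg, std::uint32_t iterations)
{
    volatile std::uint32_t sink = 0;  // volatile store keeps the loop from being removed
    const auto start = std::chrono::steady_clock::now();
    for (std::uint32_t i = 0; i < iterations; ++i) {
        sink = fn(arg);
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double>(stop - start).count();
}
```

For example, time_calls(ntoh32, t.i, 1000000u) would time one million calls of the question's ntoh32.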

the disassembly of ntoh32...

movl    %edi, -4(%rbp)
movl    -4(%rbp), %eax
movl    %eax, %edx
sall    $24, %edx
movl    -4(%rbp), %eax
andl    $65280, %eax
sall    $8, %eax
orl     %eax, %edx
movl    -4(%rbp), %eax
andl    $16711680, %eax
shrl    $8, %eax
orl     %eax, %edx
movl    -4(%rbp), %eax
shrl    $24, %eax
orl     %edx, %eax
leave

the disassembly of ntoh_4b ...

movq    %rdi, -8(%rbp)
movq    -8(%rbp), %rax
movzbl  (%rax), %eax
movzbl  %al, %eax
movl    %eax, %edx
sall    $24, %edx
movq    -8(%rbp), %rax
addq    $1, %rax
movzbl  (%rax), %eax
movzbl  %al, %eax
sall    $16, %eax
orl     %eax, %edx
movq    -8(%rbp), %rax
addq    $2, %rax
movzbl  (%rax), %eax
movzbl  %al, %eax
sall    $8, %eax
orl     %eax, %edx
movq    -8(%rbp), %rax
addq    $3, %rax
movzbl  (%rax), %eax
movzbl  %al, %eax
orl     %edx, %eax
leave

And finally, the results. I've included the time for the C library's ntohl to provide a baseline for comparison.

//v = ntohl (t.i); (1)
real    0m35.030s
user    0m34.739s
sys     0m0.245s

//v = ntoh32 (t.i); (2)
real    0m36.272s
user    0m36.070s
sys     0m0.115s

//v = ntoh_4b (t.a);  (3)
real    0m40.162s
user    0m40.013s
sys     0m0.097s

Upvotes: 0

Secure

Reputation: 4378

Dog slow? Did you actually measure it? You can rewrite it in the style of ntoh32 and significantly reduce the number of operations:

uint32_t ntoh_4b(char v[])
{
    return ( (uint32_t)(unsigned char)v[0] << 24 )
         | ( (uint32_t)(unsigned char)v[1] << 16 )
         | ( (uint32_t)(unsigned char)v[2] <<  8 )
         | ( (uint32_t)(unsigned char)v[3]       );
}
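
If you control the buffer type, one possible variation (an assumption on my part, not something the question requires) is to read into unsigned char in the first place, which removes the per-byte cast. The name ntoh_4b_u is made up for this sketch:

```cpp
#include <cstdint>

// Assumed variant: the buffer is already unsigned char, so each byte
// only needs widening to 32 bits before it is shifted into place.
inline std::uint32_t ntoh_4b_u(const unsigned char v[4])
{
    return (static_cast<std::uint32_t>(v[0]) << 24)
         | (static_cast<std::uint32_t>(v[1]) << 16)
         | (static_cast<std::uint32_t>(v[2]) <<  8)
         |  static_cast<std::uint32_t>(v[3]);
}
```

Because it works on individual bytes, the result is the same on little-endian and big-endian hosts.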

Upvotes: 0

Eli Bendersky

Reputation: 273656

The better way, IMHO, is to use the htonl and ntohl functions. If you want to be really portable, you should not think in terms of "convert to little endian". Rather, you should think "convert to host endian". That's what ntohl is for, provided your input is definitely big-endian (which is what network byte order is).

Now, instead of reading your bytes individually, you can read all four at once into an unsigned long (in binary mode), provided it is 4 bytes wide on your platform; otherwise use uint32_t. This gives you the big-endian representation in memory, which you can then convert to whatever you need. If you need host endian, use ntohl.
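
A minimal sketch of that idea, assuming a POSIX system where ntohl is declared in <arpa/inet.h> (on Windows it comes from <winsock2.h> instead). The helper name be32_to_host is hypothetical:

```cpp
#include <arpa/inet.h>  // ntohl on POSIX; assumption: not building on Windows
#include <cstdint>
#include <cstring>

// Hypothetical helper: interprets 4 bytes at p as a big-endian (network order)
// value and returns it in host byte order.
inline std::uint32_t be32_to_host(const void* p)
{
    std::uint32_t raw;
    std::memcpy(&raw, p, sizeof raw);  // bytes exactly as they were read from the file
    return ntohl(raw);                 // swaps on little-endian hosts, no-op on big-endian
}
```

The same call works whether the four bytes came from fread on a file opened in binary mode or from a socket; reading directly into a uint32_t and passing it to ntohl is equivalent.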

Upvotes: 2
