Reputation: 523
In an interview, I was asked to implement big_to_little_endian() as a macro. I implemented using shift operator. But the interviewer want me to optimize this further. I could not do it. Later I googled & searched but could not find it. Can someone help in understanding how to further optimize this code?
#define be_to_le (((x) >> 24) | (((x) & 0x00FF0000) >> 8) | (((x) & 0x0000FF00) << 8) | ((x) << 24))
Upvotes: 3
Views: 496
Reputation: 1556
He might have been referring to using a 16-bit op to swap the top two words then using 8-bit ops to swap the bytes in them -- saves a couple instructions, easiest done in a union, though C technically doesn't like it (but many compilers will accept it), and it still compiler dependent since you are hoping the compiler optimizes a couple things out:
union dword {
unsigned int i;
union shorts {
unsigned short s0, s1;
union bytes {
unsigned char c0, c1, c2, c3;
} c;
} s;
};
union dword in = (union dword)x;
union dword temp = { x.s.s1, x.s.s0 };
union dword out = { temp.s.c.c1, temp.s.c.c0, temp.s.c.c3, temp.s.c.c2 };
Not even valid C, but you get the idea (and I don't think the compiler will even emit what I'm hoping it will).
Or you can save an op, but introduce a data dependency so probably runs slower.
temp = (x << 16) | ( x >> 16)
out = ((0xff00ff00 & temp) >> 8) | (0x00ff00ff & temp) << 8)
Best is just use the compiler intrinsic since it maps to a single bswap
instruction.
Upvotes: 3