Reputation: 78585
I need to do an arbitrary reorder of a 7 bit value (Yes I know I should be using a table) and am wondering if there are any bit hacks to do this.
Example:
// <b0, b1, b2, b3, b4, b5, b6> -> <b3, b2, b4, b1, b5, b0, b6>
// the naive way
out =
(0x020 & In) << 5 |
(0x008 & In) << 2 |
(0x040 & In) |
(0x012 & In) >> 1 |
(0x004 & In) >> 2 |
(0x001 & In) >> 3;
// 6 ANDs, 5 ORs, 5 shifts = 16 ops
edit: I was thinking of something along the lines of this
Just for kicks and because I was AFTK I'm trying a brute force search for solutions of the form:
((In * C1) >> C2) & 0x7f
No solutions found.
Upvotes: 3
Views: 1226
Reputation: 35540
Before you optimize, you should make sure your 'naive' way is doing what you intend. If I make your code into a function and run this loop:
for (b=0;b<7;b++)
{
i=1<<b;
printf("%d: %02x -> %02x\n", b, i, shuffle(i));
}
It produces this output, which contradicts the comments. In fact, it loses bits.
0: 01 -> 00
1: 02 -> 01
2: 04 -> 01
3: 08 -> 20
4: 10 -> 08
5: 20 -> 00
6: 40 -> 40
In order to get the shuffle you describe, I would code it like this:
// 0 1 2 3 4 5 6
//-> 3 2 4 1 5 0 6
(0x001 & In) << 3 |
(0x004 & In) << 2 |
(0x020 & In) |
(0x012 & In) << 1 |
(0x008 & In) >> 2 |
(0x020 & In) >> 5 ;
Upvotes: 0
Reputation: 4778
The first step seems to be to understand a mathematical solution and optimize that.
see here of bit hacks
Upvotes: 5
Reputation: 1985
Have a look at the compiler output of your "naive" code, it might surprise you. I once did something like that and the compiler (VC++2005) completely changed the values of all the ands and shifts for me to make them more efficient, eg I'm sure it would remove your "(0x001 & In) >> 3".
But yes, if the reshuffle is a fixed function then a table is probably best.
Update
For a laugh I looked at the compiler output from VC++ 2005....
First I tried a constant value for "In" but the compiler wasn't fooled one bit, it produced this code:
mov eax,469h
ie. it completely optimized it away.
So ... I tried a proper input and got this:
00401D4F mov eax,ecx
00401D51 and eax,20h
00401D54 shl eax,3
00401D57 mov edx,ecx
00401D59 and edx,8
00401D5C or eax,edx
00401D5E mov edx,ecx
00401D60 sar edx,1
00401D62 and edx,2
00401D65 or edx,ecx
00401D67 sar edx,1
00401D69 shl eax,2
00401D6C and edx,9
00401D6F or eax,edx
00401D71 and ecx,40h
00401D74 or eax,ecx
That's four shift operations, five ANDs, four ORs - not bad for six inputs. Probably better than most people could do by hand.
It's probably also optimized for out-of-order execution so it'll be less clock cycles than it seems. :-)
Upvotes: 4
Reputation: 339836
There are plenty of bit-twiddling hacks for common operations, i.e. to reverse the order of the bits in a 32-bit word (10 each of shift, AND and OR, AFAICR).
In this case, with an apparently completely arbitrary mapping from input to output, I can't see any way of cleaning this up.
Use a lookup table :)
Upvotes: 2