Reputation: 4305
I am needing to figure out a bit-hack to find the first location where the bit-value changes. It could be from 0
to 1
or 1
to 0
.
// This would give 0 since no changes happen
let x1 = 0b00000000
// This also gives 0 since there is no change
let x2 = 0b11111111
// This would give 1 since the value changes after the first bit
let x3 = 0b10000000
// This would also give 1 since the value change
// happens at the first bit
let x4 = 0b01111111
// This would return 7 since the change is at the 7th bit
let x5 = 0b00000001
// This would also return 7
let x6 = 0b11111110
Any recommendations on an incantation that would give these results?
Upvotes: 3
Views: 225
Reputation: 73
Like it has been said, you should first NOT it if you have leading ones. I only knew the cdq method for this. I learned new commands from that answer. But if you want to implement the bit hacks to do them or you do not have the commands you can try this for the leading zero count:
http://aggregate.org/MAGIC/#Leading%20Zero%20Count
it also uses the ones count:
http://aggregate.org/MAGIC/#Population%20Count%20(Ones%20Count)
there are many hacks for that, you can find also at:
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetNaive
If you have a number that changes multiple times to 0 and 1 and you want to search it starting from least significant bit, get the most significant bit:
http://aggregate.org/MAGIC/#Most%20Significant%201%20Bit
and then do trailing zero count:
http://aggregate.org/MAGIC/#Trailing%20Zero%20Count
another trailing zero count method I learned yesterday is by De Bruijn sequence and table lookup, you can find that here:
https://www.youtube.com/watch?v=ZusiKXcz_ac&t=3619s
at about minute 43.
Upvotes: 1
Reputation: 364428
The key building-block is a bit-scan like bsr
. (Or 31-lzcnt
).
That finds the position of the highest 1
bit, and we can transform your input to work for this.
If the leading bit is set, NOT it before bit-scan. Like x = ((int32_t)x<0) ? ~x : x;
with mov ecx, eax
/ xor eax, -1
(like NOT but sets FLAGS) / cmovns ecx, eax
. So like an absolute-value idiom, but with NOT instead of NEG . Another option is sar reg,31
(or cdq
) / xor
with that to flip or not.
#include <bit> // C++20
#include <stdint.h>
int transition_position(uint32_t x)
{
x = ((int32_t)x<0) ? ~x : x; // make the leading bits zero
if (x == 0) __builtin_unreachable(); // optional, x|=1 is another way to handle the case where all bits are the same, allowing BSF without branching
return std::countl_zero(x) ^ 31; // 31-lzcnt in a way that GCC can optimize back to bsf
}
This compiles nicely with GCC / clang: https://godbolt.org/z/63nEcnTso
gcc11.2 and clang13 -O3 both make this asm
transition_position(unsigned int):
mov eax, edi
sar eax, 31 # missed optimization: should shift EDI so mov is off the critical path on CPUs without mov-elimination
xor eax, edi
bsr eax, eax
ret
With -march=haswell
, they both de-optimize by using lzcnt
and then an actual xor eax,31
. (That asm is still better for AMD, where bsf
is significantly slower than tzcnt
/lzcnt
, so it's good for -mtune=generic -mbmi
, but not for -march=haswell
)
Upvotes: 3