szvarga
szvarga

Reputation: 3

Finding binary representation of a number at any bit position

in the past week I struggled with this problem, and it seems I can't handle it finally. Given an arbitrary 64bit unsigned int number, if it contains the binary pattern of 31 (0b11111) at any bit position, at any bit settings, the number is valid, otherwise not.

E.g.:

0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 1111 valid

0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0011 1110 valid

0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0111 1100 valid

0000 0000 0000 0000 0000 0000 0000 0000 0000 1111 1000 0000 0000 0000 0000 0000 valid

0000 0000 0011 1110 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0011 1110 valid etc...

also:

0000 0000 0000 1100 0000 0100 0000 0000 1100 1100 0000 0000 0100 0000 0001 1111 valid

1110 0000 0000 0100 0000 0000 0011 0000 0000 0000 0000 0000 0000 0000 0011 1110 valid

0000 0000 1000 0010 0000 0010 0000 0000 0000 0000 0010 0000 0000 0000 0111 1100 valid

0000 0010 0000 0110 0000 0000 0000 0000 0000 1111 1000 0000 0000 0100 0110 0000 valid

0000 0000 0011 1110 0000 0000 0011 0000 0000 1000 0000 0000 0000 0000 0011 1110 valid etc...

but:

0000 0000 0000 1100 0000 0100 0000 0000 1100 1100 0000 0000 0100 0000 0000 1111 invalid

1110 0000 0000 0100 0000 0000 0011 0000 0000 0000 0000 0000 0000 0000 0011 1100 invalid

0000 0000 1000 0010 0000 0010 0000 0000 0000 0000 0010 0000 0000 0000 0101 1100 invalid

0000 0010 0000 0110 0000 0000 0000 0000 0000 1111 0000 0000 0000 0100 0110 0000 invalid

0000 0000 0011 1010 0000 0000 0011 0000 0000 1000 0000 0000 0000 0000 0001 1110 invalid etc...

You've got the point...

But that is just the first half of the problem. The second one is, it needs to be implemented without loops or branches (which was done already) for speed increasing, using only one check by arithmetic and/or logical, bit manipulation kind of code.

The closest I can get, is a modified version of Bit Twiddling Hacks "Determine if a word has a zero byte" ( https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord ) to check five bit blocks of zeros (negated 11111). But it still has the limit of the ability to check only fixed blocks of bits (bit 0 to 4, bit 5 to 9, etc...) not at any bit position (as in the examples above).

Any help would be greatly appreciated, since I'm totally exhausted.

Sz

Upvotes: 0

Views: 158

Answers (1)

Socowi
Socowi

Reputation: 27340

Implementation

Let me restate your goal in a slightly different formulation:

I want to check whether an integer contains 5 consecutive high bits.

From this formulation the following solution explains itself. It is written in C++.

bool contains11111(uint64_t i) {
   return i & (i << 1) & (i << 2) & (i << 3) & (i << 4);
}

This approach also works for any other pattern. For instance, if you wanted to check for 010 you would use ~i & (i<<1) & ~(i<<2). In your example the pattern's length is a prime number, but for composite numbers and especially powers of two you can optimize this even further. For instance, when searching for 1111 1111 you could use i&=i<<1; i&=i<<2; i&=i<<4; return i.

Testing

To test this on your examples I used the following program. The literals inside testcases[] were generated by running your examples through the bash command ...

{ echo 'ibase=2'; tr -dc '01\n' < fileWithYourExamples; } |
bc | sed 's/.*/UINT64_C(&),/'

#include <cinttypes>
#include <cstdio>

bool contains11111(uint64_t i) {
  return i & (i << 1) & (i << 2) & (i << 3) & (i << 4);
}

int main() {
  uint64_t testcases[] = {
      // valid
      UINT64_C(31),
      UINT64_C(62),
      UINT64_C(124),
      UINT64_C(260046848),
      UINT64_C(17451448556060734),
      UINT64_C(3382101189607455),
      UINT64_C(16142027170561130558),
      UINT64_C(36593945997738108),
      UINT64_C(145804038196167776),
      UINT64_C(17451654848708670),
      // invalid
      UINT64_C(3382101189607439),
      UINT64_C(16142027170561130556),
      UINT64_C(36593945997738076),
      UINT64_C(145804038187779168),
      UINT64_C(16325754941866014),
  };
  for (uint64_t i : testcases) {
    std::printf("%d <- %016" PRIx64 "\n", contains11111(i), i);
  }
}

This prints

1 <- 000000000000001f
1 <- 000000000000003e
1 <- 000000000000007c
1 <- 000000000f800000
1 <- 003e00000000003e
1 <- 000c0400cc00401f
1 <- e00400300000003e
1 <- 008202000020007c
1 <- 020600000f800460
1 <- 003e00300800003e
0 <- 000c0400cc00400f
0 <- e00400300000003c
0 <- 008202000020005c
0 <- 020600000f000460
0 <- 003a00300800001e

Upvotes: 2

Related Questions