Anotomica
Anotomica

Reputation: 3

How to count the number of blocks of contiguous recurring ones/zeros in a binary string in Python

So for example, I want the code to return '5' for the input '01000110', since the blocks of recurring digits are '0', '1', '000', '11', '0'. I can't come up with a way to tackle this problem. All help/comments are appreciated.

Upvotes: 0

Views: 824

Answers (3)

John Coleman
John Coleman

Reputation: 51998

The function groupbyin the itertools module provides for a natural solution:

>>> len(list(itertools.groupby('01000110')))
5

As @chrisz points outs, you could make this marginally faster by replacing len(list()) by sum().

Upvotes: 2

user3483203
user3483203

Reputation: 51155

You could use regular expressions.

(0+|1+) will match any continuous regions are 1 or 0, and then you can check the length of the resulting array.

import re

s = '01000110' 
print(len(re.findall(r'(0+|1+)', s)))    # ['0', '1', '000', '11', '0']

Output:

5

As @John Coleman pointed out, you could also use itertools, which would be marginally faster on large binary strings:

len(list(itertools.groupby(s)))

Timings:

In [18]: x = np.random.randint(2, size=100000)

In [19]: x = ''.join(map(str, x))

In [20]: %timeit len(re.findall(r'(0+|1+)', x))
10.9 ms ± 327 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [21]: %timeit len(list(itertools.groupby(x)))
9.42 ms ± 173 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [22]: %timeit sum(1 for i in itertools.groupby(x))
9.12 ms ± 156 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Upvotes: 6

HitLuca
HitLuca

Reputation: 1260

given a list of 0s and 1s with length l

array = numpy.random.randint(0, 2, (l))

the number of consecutive regions are given by this code (not optimized, just to show the concept)

count = 1
current_digit = array[0]
for digit in array:
    if digit != current_digit:
        count += 1
        current_digit = digit
count

example, with

array = [0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0]

the number of regions is

count = 9

Upvotes: 0

Related Questions