Reputation: 3
So for example, I want the code to return '5' for the input '01000110', since the blocks of recurring digits are '0', '1', '000', '11', '0'. I can't come up with a way to tackle this problem. All help/comments are appreciated.
Upvotes: 0
Views: 824
Reputation: 51998
The function groupby
in the itertools module provides for a natural solution:
>>> len(list(itertools.groupby('01000110')))
5
As @chrisz points outs, you could make this marginally faster by replacing len(list())
by sum()
.
Upvotes: 2
Reputation: 51155
You could use regular expressions.
(0+|1+)
will match any continuous regions are 1 or 0, and then you can check the length of the resulting array.
import re
s = '01000110'
print(len(re.findall(r'(0+|1+)', s))) # ['0', '1', '000', '11', '0']
Output:
5
As @John Coleman pointed out, you could also use itertools, which would be marginally faster on large binary strings:
len(list(itertools.groupby(s)))
Timings:
In [18]: x = np.random.randint(2, size=100000)
In [19]: x = ''.join(map(str, x))
In [20]: %timeit len(re.findall(r'(0+|1+)', x))
10.9 ms ± 327 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [21]: %timeit len(list(itertools.groupby(x)))
9.42 ms ± 173 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [22]: %timeit sum(1 for i in itertools.groupby(x))
9.12 ms ± 156 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Upvotes: 6
Reputation: 1260
given a list of 0s and 1s with length l
array = numpy.random.randint(0, 2, (l))
the number of consecutive regions are given by this code (not optimized, just to show the concept)
count = 1
current_digit = array[0]
for digit in array:
if digit != current_digit:
count += 1
current_digit = digit
count
example, with
array = [0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0]
the number of regions is
count = 9
Upvotes: 0