user3821012
user3821012

Reputation: 1331

numpy: detect consecutive 1 in an array

I want to detect consecutive spans of 1's in a numpy array. Indeed, I want to first identify whether the element in an array is in a span of a least three 1's. For example, we have the following array a:

    import numpy as np
    a = np.array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])

Then the following 1's in bold are the elements satisfy the requirement.

[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]

Next, if two spans of 1's are separated by at most two 0's, then the two spans make up a longer span. So the above array is charaterized as

[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]

In other words, for the original array as input, I want the output as follows:

    [True, True, True, True, True, True, True, False, False, False, False, False, True, True, True, True, True, True, True, True, True, True, False]

I have been thinking of an algorithm to implement this function, but all the one I come up with seems to complicated. So I would love to know better ways to implement this -- it would be greatly appreciated if someone can help me out.

Update:

I apologize that I did not make my question clear. I want to identify 3 or more consecutive 1's in the array as a span of 1's, and any two spans of 1's with only one or two 0's in between are identified, along with the separating 0's, as a single long span. My goal can be understood in the following way: if there are only one or two 0's between spans of 1's, I consider those 0's as errors and are supposed to be corrected as 1's.

@ritesht93 provided an answer that almost gives what I want. However, the current answer does not identify the case when there are three spans of 1's that are separated by 0's, which should be identified as one single span. For example, for the array

    a2 = np.array([0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0])

we should receive the output

    [False,  True,  True,  True,  True,  True,  True,  True,  True,
   True,  True,  True,  True,  True, False, False,  False, False,
   False,  True,  True,  True,  True,  True, False]

Update 2:

I was greatly inspired by and found the algorithm based on regular expression is easiest to implement and to understand -- though I am not sure about the efficient compared to other methods. Eventually I used the following method.

    lst = np.array([0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0])
    lst1 = re.sub(r'1{3,}', lambda x:'c'*len(x.group()), ''.join(map(str, lst)))
    print lst1

which identified spans of 1's

    0ccc0ccc00cccc00100ccccc0

and then connect spans of 1's

    lst2 = re.sub(r'c{1}0{1,2}c{1}', lambda x:'c'*len(x.group()), ''.join(map(str, lst1)))
    print lst2

which gives

    0ccccccccccccc00100ccccc0

The final result is given by

    np.array(list(lst2)) == 'c'

    array([False,  True,  True,  True,  True,  True,  True,  True,  True,
    True,  True,  True,  True,  True, False, False, False, False,
   False,  True,  True,  True,  True,  True, False])

Upvotes: 5

Views: 1616

Answers (4)

duck
duck

Reputation: 379

i know this is not so "python-wise", but since you talks about algorithm, i decided to give it a try (sorry i am not so familiar with python)

import numpy as np
a = np.array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])
b = np.array([int])

#init 2nd array 
for x in range (0,(a.size-1)):
    b = np.append(b,0)

print (b)
#1st case
for x in range (2,(a.size)):
    if (a[x-2]==1 & a[x-1]==1 & a[x]==1): #1-1-1
        b[x] = 1
        b[x-1] = 1
        b[x-2] = 1

print (b)
#2nd case
for x in range (2,(b.size)):
    if (b[x-2]==1 & b[x]==1): #1-0-1
        if (b[x-1]==0): #sorry, i forget about logical op. in python
            b[x-1] = 1

print (b)
#3rd case
for x in range (3,(b.size)):
    if (b[x-3]==1 & b[x]==1): #1-0-0-1
        if (b[x-2]==0 & b[x]-1==0):
            b[x-1] = 1
            b[x-2] = 1

#4th case
for x in range (4,(b.size)):
    if (a[x-4]==1 & a[x-3]==1 & b[x]): #1-1-0-0-1
        if (a[x-2]==0 & a[x]-1==0):
            b[x-3] = 1
            b[x-4] = 1
print (b)

I am not sure if this is exactly your expected result, but here it is:
[1 1 1 1 1 1 1 0 1 1 0 0 1 1 1 1 1 1 1 1 1 1 0]

Upvotes: 1

riteshtch
riteshtch

Reputation: 8769

Instead of solving the conventional way of looping and maintaining count we can convert all 0's and 1's to a single string and replace a regex match with another char say 2. Once that is done we split out the string again and check for bool() over each char.

>>> import re
>>> lst=[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]
>>> list(map(bool, map(int, list(re.sub(r'1{3,}0{1,2}1{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst)))))))
[True, True, True, True, True, True, True, False, True, True, False, False, True, True, True, True, True, True, True, True, True, True, False]
>>> 

All the operations happen over here:

re.sub(r'1{3,}0{1,2}1{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst)))

where it searches for a contiguous occurrence of 3 or more 1's followed by at most 2 0's i.e 1 or 2 0's followed by 3 or more 1's and replaces the whole matched string with same length string of 2's (used 2 because bool(2) is True). Also you can use tolist() method in NumPy to get a list out of NumPy array like this: np.array([1,2, 3, 4, 5, 6]).tolist()

Edit 1 : after a change in question, here is the updated answer:

>>> lst=[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]
>>> import re
>>> list(map(lambda x:False if x == 0 or x ==1 else True, map(int, list(re.sub(r'1{3,}0{1,2}1{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst)))))))
[True, True, True, True, True, True, True, False, False, False, False, False, True, True, True, True, True, True, True, True, True, True, False]
>>> 

Edit 2 FINAL ANSWER:

>>> import re
>>> lst=[0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0]
>>> while re.subn(r'[12]{3,}0{1,2}[12]{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst)))[1]:
...     lst=re.subn(r'[12]{3,}0{1,2}[12]{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst)))[0]
... 
>>> lst
'0222222222222200100111110'
>>> lst=list(re.sub(r'1{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst))))
>>> lst
['0', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '0', '0', '1', '0', '0', '2', '2', '2', '2', '2', '0']
>>> list(map(lambda x:False if x == 0 or x ==1 else True, map(int, lst)))
[False, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, False, False, True, True, True, True, True, False]
>>> 

Upvotes: 1

Divakar
Divakar

Reputation: 221524

We could solve it with a combination of binary dilation and erosion to get past the first stage and then binary closing to get the final output, like so -

from scipy.ndimage.morphology import binary_erosion,binary_dilation,binary_closing

K = np.ones(3,dtype=int) # Kernel
b = binary_dilation(binary_erosion(a,K),K)
out = binary_closing(b,K) | b

Sample runs

Case #1 :

In [454]: a
Out[454]: array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])

In [456]: out
Out[456]: 
array([ True,  True,  True,  True,  True,  True,  True, False, False,
       False, False, False,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True, False], dtype=bool)

Case #2 :

In [460]: a
Out[460]: 
array([0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0])

In [461]: out
Out[461]: 
array([False,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True, False, False, False, False,
       False,  True,  True,  True,  True,  True, False], dtype=bool)

Upvotes: 2

hvwaldow
hvwaldow

Reputation: 1376

Many ways to to it. I would split this into grouping, apply conditions to groups and flattening operations. Like so:

from itertools import groupby, starmap
import numpy as np

a = np.array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])

def condition(groups, key, newkey, minlen):
    return [(newkey, l) if l < minlen and k == key else (k, l) for k, l in groups]

def flatten(groups):
    return [k for g in starmap(lambda k, l: l * [k], groups) for k in g]

def group(l):
    return [(k, len(list(v))) for k, v in groupby(l)]

res = group(flatten(condition(group(a), 1, 0, 3)))
# groups zeros at the beginning or the end never change to ones,
# no matter their length
res = flatten([res[0]] + condition(res[1:-1], 0, 1, 3) + [res[-1]])
print [bool(v) for v in res]

Upvotes: 1

Related Questions