First occurrence of consecutive elements in a numpy array

Question

I have a binary NumPy masked array, and for each position I want the index along axis=0 where the first run of at least 3 consecutive 1s starts. If there is no such run, the result should be -999, NaN, or anything else that is clearly not an index. For example, my array looks like this:

masked_array(
data=[[[1.0, 0.0],
     [0.0, 1.0]],

    [[0.0, 1.0],
     [0.0, 1.0]],

    [[1.0, 1.0],
     [1.0, 1.0]],

    [[1.0, 1.0],
     [1.0, 0.0]],

    [[1.0, --],
     [0.0, 1.0]],

    [[1.0, 1.0],
     [1.0, 1.0]]])

and I want to get something like this:

array([[   2,    1],
       [-999,    0]])

What is the most pythonic way of doing that? Any hint would be really appreciated.

rafaelc · Accepted Answer

IIUC, you can first reshape your NumPy array to 2D and build a DataFrame, which makes everything easier. Take a look:

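import pandas as pd
# Flatten the two trailing axes so that each (i, j) position becomes one column
rows, cols = m.shape[0], m.shape[1] * m.shape[2]
df = pd.DataFrame(m.reshape(rows, cols))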

    0   1   2   3
0   1.0 0.0 0.0 1.0
1   0.0 1.0 0.0 1.0
2   1.0 1.0 1.0 1.0
3   1.0 1.0 1.0 0.0
4   1.0 0.0 0.0 1.0
5   1.0 1.0 1.0 1.0

Now you can use a reversed rolling window of 3 along axis=0 and check whether all elements in each window are 1:

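# rolling() already works along axis 0 (rows) by default; the explicit axis argument is deprecated in recent pandas
ndf = df[::-1].rolling(3).apply(all, raw=True)[::-1]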

    0   1   2   3
0   NaN NaN NaN 1.0
1   NaN 1.0 NaN NaN
2   1.0 NaN NaN NaN
3   1.0 NaN NaN NaN
4   NaN NaN NaN NaN
5   NaN NaN NaN NaN
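
To see why the frame is reversed before rolling, here is a minimal sketch on a single column. It assumes binary data and swaps in `rolling(3).min()` for `apply(all, raw=True)`, since a window contains only ones exactly when its minimum is 1. The names `s`, `forward` and `run_starts` are just illustrative.

```python
import pandas as pd

s = pd.Series([1.0, 0.0, 1.0, 1.0, 1.0, 1.0])  # column 0 of df above

# A plain rolling window looks backwards: row i covers rows i-2..i.
# Rolling over the reversed series and reversing back makes
# row i cover rows i..i+2 instead.
forward = s[::-1].rolling(3).min()[::-1]

# A window holds only ones exactly when its minimum is 1.
# Incomplete windows at the end are NaN and compare as False.
run_starts = forward.eq(1)
print(run_starts.tolist())
# → [False, False, True, True, False, False]
```

The first `True` sits at index 2, which matches the expected result for that column.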

And use idxmax() to get the index of the first occurrence of 1:

ndf[ndf>=1].idxmax()

0    2.0
1    1.0
2    NaN
3    0.0
dtype: float64

To get the layout you described, just reshape the output:

ndf[ndf>=1].idxmax().values.reshape(m.shape[1], m.shape[2])

array([[ 2.,  1.],
       [nan,  0.]])
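
If you would rather get integer indices with -999 as in the question, and skip the DataFrame round trip, something along these lines might work in NumPy alone. This is a sketch of a different technique, not the approach above: it relies on `sliding_window_view` (NumPy 1.20+), and it assumes a masked cell should break a run. The function name `first_run_start` and its parameters are made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def first_run_start(m, n=3, fill=-999):
    """Index along axis 0 where the first run of n ones starts, else `fill`."""
    # Assumption: a masked cell breaks a run, so treat it as 0.
    data = np.ma.filled(m, 0) == 1
    # Windows of length n along axis 0; the window axis is appended last.
    windows = sliding_window_view(data, n, axis=0)
    starts = windows.all(axis=-1)       # True where a run of n ones begins
    first = starts.argmax(axis=0)       # first True per position (0 if none)
    return np.where(starts.any(axis=0), first, fill)

# The array from the question, rebuilt by hand.
m = np.ma.masked_array(
    data=[[[1, 0], [0, 1]],
          [[0, 1], [0, 1]],
          [[1, 1], [1, 1]],
          [[1, 1], [1, 0]],
          [[1, 0], [0, 1]],
          [[1, 1], [1, 1]]],
    dtype=float,
)
m[4, 0, 1] = np.ma.masked  # the "--" entry

result = first_run_start(m)
print(result)
# → [[   2    1]
#    [-999    0]]
```

`argmax` returns 0 both when the first window matches and when nothing matches, so the `any` check is what separates a real index 0 from "no run found".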
