Reputation: 303
I have a binary NumPy masked array, and I want to find the index along axis=0 at which the first run of at least 3 consecutive 1s starts. If there is no such run, the result should be -999, NaN, or anything else that is clearly not an index. For example, my array looks like this:
masked_array(
  data=[[[1.0, 0.0],
         [0.0, 1.0]],
        [[0.0, 1.0],
         [0.0, 1.0]],
        [[1.0, 1.0],
         [1.0, 1.0]],
        [[1.0, 1.0],
         [1.0, 0.0]],
        [[1.0, --],
         [0.0, 1.0]],
        [[1.0, 1.0],
         [1.0, 1.0]]])
and I want to get something like this:
array([[ 2, 1],
[-999, 0]])
What is the most pythonic way of doing that? Any hint would be really appreciated.
Upvotes: 3
Views: 287
Reputation: 59274
IIUC, you can first reshape your NumPy array to 2D and build a DataFrame from it, which makes everything easier. Take a look:
import pandas as pd
# Flatten the two trailing axes: one column per original [i, j] position
rows, cols = m.shape[0], m.shape[1] * m.shape[2]
df = pd.DataFrame(m.reshape(rows, cols))
     0    1    2    3
0  1.0  0.0  0.0  1.0
1  0.0  1.0  0.0  1.0
2  1.0  1.0  1.0  1.0
3  1.0  1.0  1.0  0.0
4  1.0  0.0  0.0  1.0
5  1.0  1.0  1.0  1.0
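The reshape above assumes `m` already exists. As a minimal sketch, the example array from the question could be built and flattened like this. Treating the masked entry as 0 (so that it breaks a run) via `filled(0)` is my assumption, not something the question states, and `flat` is just an illustrative name:

```python
import numpy as np

# Example data from the question, shape (6, 2, 2); axis 0 is the one to scan.
data = np.array([
    [[1, 0], [0, 1]],
    [[0, 1], [0, 1]],
    [[1, 1], [1, 1]],
    [[1, 1], [1, 0]],
    [[1, 0], [0, 1]],  # the value at [4, 0, 1] is masked below
    [[1, 1], [1, 1]],
], dtype=float)
mask = np.zeros(data.shape, dtype=bool)
mask[4, 0, 1] = True
m = np.ma.masked_array(data, mask=mask)

# Assumption: a masked value breaks a run, so replace it with 0 first.
rows, cols = m.shape[0], m.shape[1] * m.shape[2]
flat = m.filled(0).reshape(rows, cols)  # plain ndarray, shape (6, 4)
```

Column k of `flat` corresponds to position `[k // 2, k % 2]` of the original array, which is what makes the final reshape back to `(2, 2)` line up.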
Now you can use a reversed rolling window of 3 on axis=0 and check whether all elements in the window are 1:
ndf = df[::-1].rolling(3, axis=0).apply(all, raw=True)[::-1]
     0    1    2    3
0  NaN  NaN  NaN  1.0
1  NaN  1.0  NaN  NaN
2  1.0  NaN  NaN  NaN
3  1.0  NaN  NaN  NaN
4  NaN  NaN  NaN  NaN
5  NaN  NaN  NaN  NaN
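The reason for reversing may not be obvious: a rolling window ends at the current row, so reversing the frame first makes the window at row i cover rows i, i+1 and i+2 in the original order. Here is a small sketch of the same idea, with two changes that are mine rather than the answer's: it uses `min()` in place of `apply(all)` (for binary data the window minimum is 1 only when every value is 1), and it omits the `axis` keyword, since 0 is the default and, as far as I know, recent pandas releases deprecate passing it. The name `starts` is illustrative.

```python
import numpy as np
import pandas as pd

# Same 6x4 layout as the frame above (one column per original [i, j] position).
df = pd.DataFrame(np.array([
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
], dtype=float))

# Reverse, roll, reverse back: row i now summarises rows i, i+1 and i+2.
# The last two rows have no full window ahead of them and stay NaN.
starts = df[::-1].rolling(3).min()[::-1]
```

In `starts`, a value of 1.0 marks a row where a run of three 1s begins, 0.0 marks a row where it does not, and NaN marks rows too close to the end to hold a full window.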
Then keep only the 1s and use idxmax() to get the index of the first occurrence of 1 in each column:
ndf[ndf>=1].idxmax()
0    2.0
1    1.0
2    NaN
3    0.0
dtype: float64
To get the layout you described, just reshape the output:
ndf[ndf>=1].idxmax().values.reshape(m.shape[1], m.shape[2])
array([[ 2., 1.],
[nan, 0.]])
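If you would rather get an integer array with -999 for "no run", as the question asks, here is a NumPy-only sketch of the same sliding-window idea. It is an alternative to the pandas route, not part of it: `first_run_start` is a hypothetical helper name, it relies on `numpy.lib.stride_tricks.sliding_window_view` (NumPy 1.20 or later), and it assumes the masked entry has already been filled with 0.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def first_run_start(binary, length=3, fill=-999):
    """Index along axis 0 where the first run of `length` ones starts."""
    # One window per possible start index; the window values sit on the last axis.
    windows = sliding_window_view(binary, length, axis=0)
    hits = (windows == 1).all(axis=-1)
    # argmax gives the first True along axis 0, but also 0 when nothing is True,
    # so positions without any hit are replaced by the sentinel.
    return np.where(hits.any(axis=0), hits.argmax(axis=0), fill)


# Question's data, with the masked value at [4, 0, 1] filled with 0 (assumption).
binary = np.array([
    [[1, 0], [0, 1]],
    [[0, 1], [0, 1]],
    [[1, 1], [1, 1]],
    [[1, 1], [1, 0]],
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1]],
], dtype=float)

result = first_run_start(binary)
print(result.tolist())  # [[2, 1], [-999, 0]]
```

This avoids the round trip through a DataFrame and the final reshape, at the cost of requiring a reasonably recent NumPy.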
Upvotes: 3