user1938027

Reputation: 151

Getting index of first occurrence in each row

I have this array full of Boolean values:

array([[[ True,  True, False, False, False, False],
        [ True, False, False, False, False, False]],

       [[False, False,  True, False,  True, False],
        [ True, False, False, False, False, False]],

       [[ True, False, False, False, False, False],
        [ True, False, False, False, False, False]]], dtype=bool)

For every row (that is, every pair of indices along the first two axes), I want the index of the first True along the last axis, so the answer would look something like this:

array([[0, 0, 0],
       [0, 1, 0],
       [1, 0, 2],
       [1, 1, 0],
       [2, 0, 0],
       [2, 1, 0]])

Is there a simple and fast way of doing so?

Upvotes: 4

Views: 2162

Answers (3)

Nick K

Reputation: 23

Adding to the other answers: for a 2-D array a, if you want the index of the first column that is True in each row, and n (the number of columns in a) for rows that contain no True:

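first_occs = np.argmax(a, axis=1)
# Parenthesize before casting: ~ must act on the boolean mask, not on the ints
all_zeros = (~a.any(axis=1)).astype(int)  # 1 for rows with no True, else 0
first_occs_modified = first_occs + all_zeros * a.shape[1]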
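
As a quick sanity check of the sentinel idea, here is a minimal sketch on a made-up 2-D array (the array and the name `result` are illustrative, not from the question). It uses np.where to pick the column count for rows without any True, which should be equivalent to the arithmetic above:

```python
import numpy as np

# Hypothetical 2-D input: the middle row contains no True at all
a = np.array([[False,  True,  True],
              [False, False, False],
              [ True, False, False]])

first_occs = np.argmax(a, axis=1)   # argmax returns 0 for an all-False row
no_true = ~a.any(axis=1)            # boolean mask of rows without any True

# Use the number of columns as the "not found" marker
result = np.where(no_true, a.shape[1], first_occs)
print(result)
```

Keeping the mask boolean avoids the integer cast entirely, so there is no operator-precedence pitfall with ~.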

Upvotes: 0

Jaime

Reputation: 67417

I cannot test this right now, but I think this should work:

arr.argmax(axis=1).T

argmax on boolean arrays short-circuits as of NumPy 1.9, so it should be preferred to where or nonzero for this use case.


EDIT: OK, so the above solution doesn't work, but the approach with argmax is still useful:

In [23]: mult = np.prod(arr.shape[:-1])  # np.product was removed in NumPy 2.0

In [24]: np.column_stack(np.unravel_index(arr.shape[-1]*np.arange(mult) +
   ....:                                  arr.argmax(axis=-1).ravel(),
   ....:                                  arr.shape))
Out[24]:
array([[0, 0, 0],
       [0, 1, 0],
       [1, 0, 2],
       [1, 1, 0],
       [2, 0, 0],
       [2, 1, 0]])
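
For comparison, here is a minimal sketch of the same argmax idea that builds the leading indices with np.indices instead of np.unravel_index. The names `first`, `lead` and `result` are my own, and `arr` is the array from the question. Note that argmax returns 0 for a row with no True, so this assumes every row has at least one True:

```python
import numpy as np

arr = np.array([[[ True,  True, False, False, False, False],
                 [ True, False, False, False, False, False]],
                [[False, False,  True, False,  True, False],
                 [ True, False, False, False, False, False]],
                [[ True, False, False, False, False, False],
                 [ True, False, False, False, False, False]]])

# Index of the first True along the last axis, shape (3, 2)
first = arr.argmax(axis=-1)

# All (i, j) index pairs of the leading axes in row-major order, shape (6, 2)
lead = np.indices(first.shape).reshape(first.ndim, -1).T

# Append the argmax result as the last column
result = np.column_stack([lead, first.ravel()])
print(result)
```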

Upvotes: 3

Saullo G. P. Castro

Reputation: 58865

It seems you want np.where() combined with the solution of this answer to find unique rows:

b = np.array(np.where(a)).T
#array([[0, 0, 0],
#       [0, 0, 1],
#       [0, 1, 0],
#       [1, 0, 2],
#       [1, 0, 4],
#       [1, 1, 0],
#       [2, 0, 0],
#       [2, 1, 0]], dtype=int64)
c = b[:,:2]
d = np.ascontiguousarray(c).view(np.dtype((np.void, c.dtype.itemsize * c.shape[1])))
_, idx = np.unique(d, return_index=True)

b[idx]
#array([[0, 0, 0],
#       [0, 1, 0],
#       [1, 0, 2],
#       [1, 1, 0],
#       [2, 0, 0],
#       [2, 1, 0]], dtype=int64)
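
On newer NumPy versions (1.13 and later, where np.unique accepts an axis argument) the void-view trick may not be needed. A minimal sketch, assuming `a` is the array from the question and using np.argwhere as a shorthand for the transposed np.where result:

```python
import numpy as np

a = np.array([[[ True,  True, False, False, False, False],
               [ True, False, False, False, False, False]],
              [[False, False,  True, False,  True, False],
               [ True, False, False, False, False, False]],
              [[ True, False, False, False, False, False],
               [ True, False, False, False, False, False]]])

# Coordinates of every True, one row per hit, in row-major order
b = np.argwhere(a)

# First occurrence of each unique (i, j) prefix; because b is in row-major
# order, that occurrence carries the smallest index along the last axis
_, idx = np.unique(b[:, :2], axis=0, return_index=True)

result = b[idx]
print(result)
```

Unlike the argmax approaches, rows with no True simply do not appear in the output.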

Upvotes: 1
