Reputation: 151

Getting index of first occurrence in each row

I have this array full of Boolean values:

array([[[ True,  True, False, False, False, False],
        [ True, False, False, False, False, False]],

       [[False, False, True, False, True, False],
        [ True, False, False, False, False, False]],

       [[ True, False, False, False, False, False],
        [ True, False, False, False, False, False]]], dtype=bool)

I want the index of the first occurrence of True in each innermost row (that is, along the last axis), so the answer would look something like this:

array([[0, 0, 0],
       [0, 1, 0],
       [1, 0, 2],
       [1, 1, 0],
       [2, 0, 0],
       [2, 1, 0]])

Is there a simple and fast way of doing so?

Upvotes: 4

Answers (3)

Nick K

Reputation: 23

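Adding to the other answers: for a 2-D array a, if you want the index of the first column that is True in each row, and n (the number of columns in a) for any row that doesn't contain True:

first_occs = np.argmax(a, axis=1)
# 1 for rows with no True, 0 otherwise; the parentheses matter,
# because ~ has to be applied to the bool array before the cast to int
all_zeros = (~a.any(axis=1)).astype(int)
first_occs_modified = first_occs + all_zeros * a.shape[1]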
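
A minimal sketch of the same idea written with np.where, in case a worked example helps. The small array a below is made up for illustration and is not from the question; its second row has no True, so it should map to n = 3.

```python
import numpy as np

# Hypothetical 2-D input; the middle row contains no True at all.
a = np.array([[False, True,  True],
              [False, False, False],
              [True,  False, False]])

n_cols = a.shape[1]

# argmax returns 0 for an all-False row, so those rows are replaced by n_cols.
first_occs = np.where(a.any(axis=1), a.argmax(axis=1), n_cols)
print(first_occs)
```

This avoids the integer arithmetic on the mask, at the cost of one extra temporary array.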

Upvotes: 0

Jaime

Reputation: 67417

Cannot test right now, but I think this should work

arr.argmax(axis=1).T

argmax on boolean arrays short-circuits in NumPy 1.9, so it should be preferred to where or nonzero for this use case.

EDIT: OK, so the above solution doesn't work, but the approach with argmax is still useful:

In [23]: mult = np.prod(arr.shape[:-1])  # np.product was an alias that NumPy 2.0 removed

In [24]: np.column_stack(np.unravel_index(arr.shape[-1]*np.arange(mult) +
   ....:                                  arr.argmax(axis=-1).ravel(),
   ....:                                  arr.shape))
Out[24]:
array([[0, 0, 0],
       [0, 1, 0],
       [1, 0, 2],
       [1, 1, 0],
       [2, 0, 0],
       [2, 1, 0]])
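
The same argmax idea can probably be written without unravel_index. The sketch below is one possible variant, not the answer's original code: it pairs argmax with np.argwhere on an any() mask, which also drops rows that contain no True (argmax alone would report 0 for them). The names first, has_true and lead are just illustrative.

```python
import numpy as np

arr = np.array([[[True,  True,  False, False, False, False],
                 [True,  False, False, False, False, False]],
                [[False, False, True,  False, True,  False],
                 [True,  False, False, False, False, False]],
                [[True,  False, False, False, False, False],
                 [True,  False, False, False, False, False]]])

first = arr.argmax(axis=-1)     # position of the first True along the last axis
has_true = arr.any(axis=-1)     # False where a row has no True at all

# Leading indices (i, j) of the rows that do contain a True, in row-major order.
lead = np.argwhere(has_true)

# Boolean indexing walks first in the same row-major order, so the columns line up.
result = np.column_stack([lead, first[has_true]])
print(result)
```

For the array in the question every row contains a True, so this should reproduce the six rows shown in Out[24].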

Upvotes: 3

Saullo G. P. Castro

Reputation: 58865

It seems you want np.where() combined with the solution from this answer for finding unique rows:

b = np.array(np.where(a)).T
#array([[0, 0, 0],
#       [0, 0, 1],
#       [0, 1, 0],
#       [1, 0, 2],
#       [1, 0, 4],
#       [1, 1, 0],
#       [2, 0, 0],
#       [2, 1, 0]], dtype=int64)
c = b[:,:2]
d = np.ascontiguousarray(c).view(np.dtype((np.void, c.dtype.itemsize * c.shape[1])))
_, idx = np.unique(d, return_index=True)

b[idx]
#array([[0, 0, 0],
#       [0, 1, 0],
#       [1, 0, 2],
#       [1, 1, 0],
#       [2, 0, 0],
#       [2, 1, 0]], dtype=int64)
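
On newer NumPy versions the void-view trick may not be needed. The sketch below assumes NumPy 1.13 or later, where np.unique accepts an axis argument; it is a variation on the approach above rather than the original code, and it uses np.argwhere as a shorthand for np.array(np.where(a)).T.

```python
import numpy as np

a = np.array([[[True,  True,  False, False, False, False],
               [True,  False, False, False, False, False]],
              [[False, False, True,  False, True,  False],
               [True,  False, False, False, False, False]],
              [[True,  False, False, False, False, False],
               [True,  False, False, False, False, False]]])

# All (i, j, k) positions of True, sorted in row-major order.
b = np.argwhere(a)

# Unique (i, j) prefixes; return_index gives the first row of b for each prefix,
# which holds the smallest k because b is already sorted.
_, idx = np.unique(b[:, :-1], axis=0, return_index=True)

result = b[idx]
print(result)
```

Note that this still enumerates every True in the array before discarding the later ones, so it does more work than an argmax-based approach when rows contain many True values.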

Upvotes: 1
