myradio

Reputation: 1775

Pythonically get a subset of rows from a numpy matrix based on a condition on each row and all columns

Given the following matrix,

In [0]: a = np.array([[1,2,9,4,2,5],[4,5,1,4,2,4],[2,3,6,7,8,9],[5,6,7,4,3,6]])
Out[0]: 
array([[1, 2, 9, 4, 2, 5],
       [4, 5, 1, 4, 2, 4],
       [2, 3, 6, 7, 8, 9],
       [5, 6, 7, 4, 3, 6]])

I want to get the indices of the rows that have 9 as a member. That is,

idx = [0,2]

Currently I am doing this,

def myf(x):
    if any(x==9):
        return True
    else:
        return False

aux = np.apply_along_axis(myf, axis=1, arr=a)
idx = np.where(aux)[0]

And I get the result I wanted.

In [1]: idx
Out[1]: array([0, 2], dtype=int64)

But this method is very slow (so presumably there is a faster way) and certainly not very Pythonic.

How can I do this in a cleaner, more Pythonic, but above all more efficient way?

Note that this question is close to this one but here I want to apply the condition on the entire row.

Upvotes: 0

Views: 954

Answers (3)

Andy L.

Reputation: 25239

You may try np.nonzero and np.unique

Check on 9

np.unique((a == 9).nonzero()[0])

Out[356]: array([0, 2], dtype=int64)

Check on 6

np.unique((a == 6).nonzero()[0])

Out[358]: array([2, 3], dtype=int64)

Check on 8

np.unique((a == 8).nonzero()[0])

Out[359]: array([2], dtype=int64)

For a value that does not occur in the array, it returns an empty array

np.unique((a == 88).nonzero()[0])

Out[360]: array([], dtype=int64)
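
For reuse, the same expression could be wrapped in a small function. This is only a sketch: the name rows_containing is made up for illustration and is not part of NumPy.

```python
import numpy as np

a = np.array([[1, 2, 9, 4, 2, 5],
              [4, 5, 1, 4, 2, 4],
              [2, 3, 6, 7, 8, 9],
              [5, 6, 7, 4, 3, 6]])

def rows_containing(arr, value):
    """Hypothetical helper: sorted indices of the rows of a 2-D array that contain value."""
    # nonzero() on a 2-D boolean array returns (row_indices, column_indices)
    row_idx, _ = (arr == value).nonzero()
    # a row appears once per match, so collapse repeats
    return np.unique(row_idx)

print(rows_containing(a, 9))
print(rows_containing(a, 6))
```

np.unique is what keeps a row from being listed twice when the value occurs more than once in it, as 6 does in the last row here.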

Upvotes: 0

Alain T.

Reputation: 42143

You could combine np.argwhere and np.any:

np.argwhere(np.any(a==9,axis=1))[:,0]
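
To make the steps visible, here is the same idea split up, using the array from the question. The variable names are only for illustration. np.flatnonzero is shown as an alternative for the last step; it should give the same indices for a 1-D mask.

```python
import numpy as np

a = np.array([[1, 2, 9, 4, 2, 5],
              [4, 5, 1, 4, 2, 4],
              [2, 3, 6, 7, 8, 9],
              [5, 6, 7, 4, 3, 6]])

# One boolean per row: True if any element in that row equals 9
mask = np.any(a == 9, axis=1)

# argwhere returns an (n, 1) array for a 1-D mask; take column 0 to flatten it
idx_argwhere = np.argwhere(mask)[:, 0]

# flatnonzero returns the 1-D indices directly
idx_flat = np.flatnonzero(mask)

print(mask)
print(idx_argwhere, idx_flat)
```

Because the mask already has one entry per row, no np.unique step is needed even if 9 appears several times in a row.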

Upvotes: 1

salt-die

Reputation: 854

Use np.argwhere to find the indices where a==9 and use the 0th column of those indices to index a:

In [171]: a = np.array([[1,2,9,4,2,5],[4,5,1,4,2,4],[2,3,6,7,8,9],[5,6,7,4,3,6]])
     ...: 
     ...: indices = np.argwhere(a==9)
     ...: a[indices[:,0]]
Out[171]: 
array([[1, 2, 9, 4, 2, 5],
       [2, 3, 6, 7, 8, 9]])

...or, if you just need the row numbers, keep indices[:,0]. If 9 can appear more than once per row and you don't want duplicate rows listed, you can use np.unique to filter the result (it changes nothing in this example):

In [173]: rows = indices[:,0]

In [174]: np.unique(rows)
Out[174]: array([0, 2])
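
A case where np.unique does make a difference, sketched with a made-up array b (not from the question) whose first row contains 9 twice:

```python
import numpy as np

# Hypothetical array: row 0 holds 9 twice, row 2 holds it once
b = np.array([[9, 1, 9],
              [2, 3, 4],
              [5, 9, 6]])

indices = np.argwhere(b == 9)   # one (row, column) pair per match
rows = indices[:, 0]            # row 0 is listed twice
unique_rows = np.unique(rows)   # each matching row listed once

print(rows)
print(unique_rows)
print(b[unique_rows])
```

Indexing with rows directly would return the first row twice; indexing with unique_rows returns each matching row once.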

Upvotes: 1
