Wenhui

Reputation: 61

numpy - ndarray - how to remove rows based on another array

I want to remove rows from an ndarray based on another array. For example:

k = [1,3,99]

n = [
  [1,'a'],
  [2,'b'],
  [3,'c'],
  [4,'c'],
  [.....],
  [99, 'a'],
  [100,'e']
]

Expected result:

out = [
  [2,'b'],
  [4,'c'],
  [.....],
  [100,'e']
]

Rows whose first-column value appears in k should be removed.

Upvotes: 3

Views: 1490

Answers (2)

Aakash Goel

Reputation: 1030

If your data structure is a list, a simple solution is shown below. If it is not, you can convert it to a list with list().

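def check(row):
    # Return the row only if its first element is not in k.
    k = [1, 3, 99]
    if row[0] not in k:
        return row

# check() returns None for rows to drop, so filter those out.
final_list = [row for row in map(check, n) if row is not None]
print(final_list)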

Upvotes: 0

Divakar

Reputation: 221564

You can use np.in1d to create a mask of matches between the first column of n and k, and then use the inverted mask to select the non-matching rows from n, like so -

n[~np.in1d(n[:,0].astype(int), k)]

If the first column is already of int dtype, skip the .astype(int) conversion step.

Sample run -

In [41]: n
Out[41]: 
array([['1', 'a'],
       ['2', 'b'],
       ['3', 'c'],
       ['4', 'c'],
       ['99', 'a'],
       ['100', 'e']], 
      dtype='|S21')

In [42]: k
Out[42]: [1, 3, 99]

In [43]: n[~np.in1d(n[:,0].astype(int), k)]
Out[43]: 
array([['2', 'b'],
       ['4', 'c'],
       ['100', 'e']], 
      dtype='|S21')
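
The same masking idea can be written as a self-contained script. This is only a sketch: it swaps in np.isin, which the NumPy documentation recommends over np.in1d for new code, and it rebuilds the sample data from the run above by hand. The variable name matches is an illustrative choice, not part of the original answer.

```python
import numpy as np

k = [1, 3, 99]
n = np.array([['1', 'a'], ['2', 'b'], ['3', 'c'],
              ['4', 'c'], ['99', 'a'], ['100', 'e']])

# True where the first column (converted to int) equals some value in k.
matches = np.isin(n[:, 0].astype(int), k)

# Invert the mask to keep only the rows that did not match.
out = n[~matches]
print(out)
```

As with np.in1d, the .astype(int) step is only needed because the array stores its first column as strings.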

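For performance, if the first column is sorted, we can use np.searchsorted -

# Assumes n[:,0] is sorted, has a numeric dtype comparable with k,
# and contains every value in k.
mask = np.ones(n.shape[0], dtype=bool)
mask[np.searchsorted(n[:,0], k)] = False
out = n[mask]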
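
np.searchsorted returns insertion positions, so a value of k that is absent from the first column would point at an unrelated row, or one past the end. Below is a hedged sketch of one way to guard against that. It assumes the IDs are held in a separate sorted integer array (here called ids, with a parallel labels array); those names and the extra value 500 are made up for illustration.

```python
import numpy as np

k = np.array([1, 3, 99, 500])            # 500 does not occur in ids
ids = np.array([1, 2, 3, 4, 99, 100])    # sorted integer first column
labels = np.array(['a', 'b', 'c', 'c', 'a', 'e'])

# Insertion positions of each k value in the sorted ids.
idx = np.searchsorted(ids, k)

# Clip so a position one past the end stays a valid index.
idx = np.clip(idx, 0, ids.size - 1)

# Keep only the positions where the value was actually found.
found = ids[idx] == k

mask = np.ones(ids.size, dtype=bool)
mask[idx[found]] = False

out_ids = ids[mask]
out_labels = labels[mask]
print(out_ids, out_labels)
```

The equality check costs one extra pass over k, but it keeps rows from being dropped by accident when k contains values that are not in the data.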

Upvotes: 1
