user4230219

Reputation: 15

How can I search for an array within a larger array using NumPy?

I am a beginner with NumPy. Does NumPy have a function that can search the rows of one array for the elements of another array and return the matching rows? Thanks!

import numpy as np

def searchBinA(A, B=('04', '22')):
    result = []
    # ?......?  numpy.search(B, A)?  Is this correct?
    return result

A = [['03', '04', '18', '22', '25', '29', '30'],
     ['02', '04', '07', '09', '14', '29', '30'],
     ['06', '08', '11', '13', '17', '19', '30'],
     ['04', '08', '22', '23', '27', '29', '30'],
     ['03', '05', '15', '22', '24', '25', '30']]

print(searchBinA(A))


Expected output: [['03', '04', '18', '22', '25', '29', '30'], ['04', '08', '22', '23', '27', '29', '30']]

Upvotes: 1

Views: 64

Answers (1)

Divakar

Reputation: 221574

Assuming the inputs are NumPy arrays and that there are no duplicates within each row of A (or within B), here's an approach using np.in1d -

A[np.in1d(A,B).reshape(A.shape).sum(1) == len(B)]

Explanation -

  1. Get a mask of the elements of A that match any element of B with np.in1d(A,B). Note that this is a flattened 1D boolean array with one entry per element of A.

  2. Reshape that boolean array back to A's shape, so that each row of the mask corresponds to a row of A. We are looking for the rows that have n matches, where n is the number of elements in B. Since the elements within each row are unique, a row with n matches must contain every element of B, so those are the rows we want in the final output.

  3. Therefore, sum the reshaped 2D boolean array along axis 1 to get the number of matches in each row, and compare those counts against n. This gives a boolean mask with one entry per row, which, when used to index A, selects the desired rows.
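The three steps above can be sketched with the intermediate results kept in named variables. This is only an illustrative breakdown of the one-liner, not a different method: the variable names are made up for the example, and np.isin is used in place of np.in1d because the NumPy documentation recommends it for new code (the two should give the same result on a flattened array).

```python
import numpy as np

A = np.array([['03', '04', '18', '22', '25', '29', '30'],
              ['02', '04', '07', '09', '14', '29', '30'],
              ['06', '08', '11', '13', '17', '19', '30'],
              ['04', '08', '22', '23', '27', '29', '30'],
              ['03', '05', '15', '22', '24', '25', '30']])
B = np.array(['04', '22'])

# Step 1: flat boolean mask, True where an element of A occurs in B.
flat_mask = np.isin(A.ravel(), B)

# Step 2: restore A's 2D shape so each mask row lines up with a row of A.
mask_2d = flat_mask.reshape(A.shape)

# Step 3: count the matches in each row and keep the rows whose count
# equals len(B). This relies on rows of A and B having no duplicates.
matches_per_row = mask_2d.sum(axis=1)
row_mask = matches_per_row == len(B)
result = A[row_mask]

print(matches_per_row)
print(result)
```

Note that np.isin(A, B) called on the 2D array directly already returns a mask with A's shape, so the ravel and reshape are kept here only to mirror the steps as described.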

Sample run -

In [23]: A
Out[23]: 
array([['03', '04', '18', '22', '25', '29', '30'],
       ['02', '04', '07', '09', '14', '29', '30'],
       ['06', '08', '11', '13', '17', '19', '30'],
       ['04', '08', '22', '23', '27', '29', '30'],
       ['03', '05', '15', '22', '24', '25', '30']], 
      dtype='|S2')

In [24]: B
Out[24]: 
array(['04', '22'], 
      dtype='|S2')

In [25]: A[np.in1d(A,B).reshape(A.shape).sum(1) == len(B)]
Out[25]: 
array([['03', '04', '18', '22', '25', '29', '30'],
       ['04', '08', '22', '23', '27', '29', '30']], 
      dtype='|S2')
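If the no-duplicates assumption cannot be guaranteed, counting matches can select the wrong rows, because a row holding '04' twice would also reach a count of 2. A possible variant, sketched below, checks each value of B separately instead of counting. The function name rows_containing_all is hypothetical and not part of NumPy, and the sketch assumes B is non-empty.

```python
import numpy as np

def rows_containing_all(A, B):
    """Return the rows of A that contain every value in B (hypothetical helper)."""
    A = np.asarray(A)
    B = np.unique(B)  # drop duplicates in B so each value is checked once
    # For each value b, (A == b).any(axis=1) flags the rows that contain it.
    contains_each = np.array([(A == b).any(axis=1) for b in B])
    # Keep a row only if it was flagged for every value of B.
    return A[contains_each.all(axis=0)]

# Example with a duplicate inside a row: the first row holds '04' twice
# but no '22', so it should not be selected.
A_dup = np.array([['04', '04', '01'],
                  ['04', '22', '01']])
print(rows_containing_all(A_dup, ['04', '22']))
```

This loops over B in Python, so it is likely slower than the counting approach when B is large; it trades speed for not depending on unique elements.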

Upvotes: 1
