Searching in numpy array

Question

I have a 2D numpy array, say A, sorted with respect to Column 0, e.g.:

Col.0	Col.1	Col.2
10	2.45	3.25
11	2.95	4
12	3.45	4.25
15	3.95	5
18	4.45	5.25
21	4.95	6
23	5.45	6.25
27	5.95	7
29	6.45	7.25
32	6.95	8
35	7.45	8.25

Each row is unique: Col. 0 is the identification number of a point in the xy plane, and Columns 1 and 2 are the x and y co-ordinates of that point. I have another array B (whose rows can contain duplicate data). Its Column 0 and Column 1 store x and y co-ordinates.

Col.0	Col.1
2.45	3.25
4.45	5.25
6.45	7.25
2.45	3.25

My aim is to find the row index in array A corresponding to each row of array B without using a for loop. So, in this case, my output should be [0,4,8,0]. I know that numpy searchsorted can look up multiple values in one shot, but it compares against a single column of A, not multiple columns. Is there a way to do this?

Naphat Amundsen · Accepted Answer

Pure numpy solution:

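My intuition is to take the difference between a[:,1:] and b by broadcasting, which gives an array of shape (11, 4, 2). Rows that match are all zeros. Comparing that difference with False (which is the same as comparing with 0) gives a boolean mask c of the same shape. Then c.all(2) results in a boolean array of shape (11, 4), where each True element marks a match between a row of a and a row of b. Finally, np.nonzero gives the indices of those elements, and ordering the row indices by their b index returns them in the order of b. Note that this relies on the coordinates being exactly equal as floats.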

import numpy as np

a = np.array([
    [10, 2.45, 3.25],
    [11, 2.95, 4],
    [12, 3.45, 4.25],
    [15, 3.95, 5],
    [18, 4.45, 5.25],
    [21, 4.95, 6],
    [23, 5.45, 6.25],
    [27, 5.95, 7],
    [29, 6.45, 7.25],
    [32, 6.95, 8],
    [35, 7.45, 8.25],
])

b = np.array([
    [2.45, 3.25],
    [4.45, 5.25],
    [6.45, 7.25],
    [2.45, 3.25],
])

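# Broadcast (11, 1, 2) against (4, 2) -> (11, 4, 2); True where the difference is 0
c = (a[:,np.newaxis,1:]-b) == False
# c.all(2) is (11, 4): True where both x and y match; rows index a, cols index b
rows, cols = c.all(2).nonzero()
# Reorder the indices into a so that they follow the order of the rows in b
print(rows[cols.argsort()])
# [0 4 8 0]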
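
If the coordinates in b come from a separate calculation, exact float equality may miss rows that differ only by rounding, and the snippet above also assumes that every row of b has exactly one match in a. A minimal sketch of a tolerant variant of the same broadcasting idea follows. The helper name match_rows, the atol default and the -1 "not found" marker are my own assumptions for illustration, not part of the answer above.

```python
import numpy as np

def match_rows(a_xy, b_xy, atol=1e-9):
    """For each row of b_xy, return the index of the first matching row of a_xy, or -1."""
    # Broadcast (len(b), 1, 2) against (1, len(a), 2) -> (len(b), len(a), 2).
    close = np.isclose(b_xy[:, np.newaxis, :], a_xy[np.newaxis, :, :],
                       rtol=0.0, atol=atol)
    # A row matches only if both coordinates are within the tolerance.
    hits = close.all(axis=2)            # shape (len(b), len(a))
    # argmax returns the position of the first True in each row (0 if none).
    idx = hits.argmax(axis=1)
    # Mark rows of b that matched nothing in a (hypothetical -1 convention).
    idx[~hits.any(axis=1)] = -1
    return idx

# Small example: the third row of b has no counterpart in a.
a = np.array([[10, 2.45, 3.25], [18, 4.45, 5.25], [29, 6.45, 7.25]])
b = np.array([[2.45, 3.25], [6.45, 7.25], [9.0, 9.0], [2.45, 3.25]])
result = match_rows(a[:, 1:], b)
print(result.tolist())  # [0, 2, -1, 0]
```

Like the original, this builds an intermediate array with len(a) * len(b) * 2 elements, so it is only practical when that product fits comfortably in memory.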
