Using NumPy argsort and take in 2D arrays

Question

The aim is to calculate the (squared) distance matrix between two sets of points (set1 and set2), use argsort() to obtain the sorted indexes, and use take() to extract the sorted array. I know I could call sort() directly, but I need the indexes for later steps.

I am relying on NumPy's fancy indexing. I could not get take() to work directly with the matrix of indexes returned by argsort(), but adding a per-row offset makes it work: take() flattens the source array, so the elements of the second row sit at index + len(set2), those of the third row at index + 2*len(set2), and so forth (see below):

import numpy as np

# Squared distances; set1 and set2 are listed under "Data to play with"
dist  = np.subtract.outer(set1[:, 0], set2[:, 0])**2
dist += np.subtract.outer(set1[:, 1], set2[:, 1])**2
dist += np.subtract.outer(set1[:, 2], set2[:, 2])**2
a = np.argsort( dist, axis=1 )
a += np.array([[ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
               [10, 10, 10, 10, 10, 10, 10, 10, 10, 10],
               [20, 20, 20, 20, 20, 20, 20, 20, 20, 20],
               [30, 30, 30, 30, 30, 30, 30, 30, 30, 30]])
s1 = np.sort(dist, axis=1)
s2 = np.take(dist, a)
# np.nonzero() returns a tuple, so comparing it to False is not a real test
np.array_equal(s1, s2)
# True, meaning that it works...
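
The hard-coded offsets above are just row_number * number_of_columns. Here is a minimal sketch of that flattening behaviour, assuming a small made-up array d in place of dist (d, idx, offsets and taken are illustrative names only):

```python
import numpy as np

# Hypothetical 3x4 array standing in for dist.
d = np.array([[4., 1., 3., 2.],
              [9., 7., 8., 6.],
              [0., 5., 2., 1.]])

idx = np.argsort(d, axis=1)  # per-row column indexes

# Without an axis argument, np.take indexes the flattened array,
# so row r starts at flat position r * ncols.
offsets = np.arange(d.shape[0])[:, None] * d.shape[1]
flat_idx = idx + offsets

taken = np.take(d, flat_idx)
print(np.array_equal(taken, np.sort(d, axis=1)))
```

The offsets are derived from d.shape here, so nothing has to be typed out when the number of points changes.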

The main question is: is there a direct way to use take() without summing these indexes?

Data to play with:

set1 = np.array([[ 250., 0.,    0.],
                 [ 250., 0.,  510.],
                 [-250., 0.,    0.],
                 [-250., 0.,    0.]])

set2 = np.array([[  61.0, 243.1, 8.3],
                 [ -43.6, 246.8, 8.4],
                 [ 102.5, 228.8, 8.4],
                 [  69.5, 240.9, 8.4],
                 [ 133.4, 212.2, 8.4],
                 [ -52.3, 245.1, 8.4],
                 [-125.8, 216.8, 8.5],
                 [-154.9, 197.1, 8.6],
                 [  61.0, 243.1, 8.7],
                 [ -26.2, 249.3, 8.7]])

Other related questions:

- Euclidean distance between points in two different Numpy arrays, not within

Jaime · Accepted Answer

I don't think there is a way to use np.take without going to flat indices. Since dimensions are likely to change, you are better off using np.ravel_multi_index for that, doing something like this:

a = np.argsort(dist, axis=1)
a = np.ravel_multi_index((np.arange(dist.shape[0])[:, None], a), dims=dist.shape)
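
To check that this gives the same flat indices as adding the offsets by hand, here is a small self-contained sketch. The array d and the names order, rows, flat and manual are made up for illustration and are not from the question:

```python
import numpy as np

# Hypothetical 3x4 array standing in for dist.
d = np.array([[4., 1., 3., 2.],
              [9., 7., 8., 6.],
              [0., 5., 2., 1.]])

order = np.argsort(d, axis=1)
rows = np.arange(d.shape[0])[:, None]  # column vector of row numbers

# Convert (row, column) pairs to positions in the flattened array.
flat = np.ravel_multi_index((rows, order), dims=d.shape)

# Same thing computed manually: column index + row * ncols.
manual = order + rows * d.shape[1]

print(np.array_equal(flat, manual))
print(np.array_equal(np.take(d, flat), np.sort(d, axis=1)))
```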

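Alternatively, you can use fancy indexing without using take. Here a must be the plain argsort result, before the ravel_multi_index step:

a = np.argsort(dist, axis=1)
s2 = dist[np.arange(dist.shape[0])[:, None], a]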
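
A runnable sketch of the fancy-indexing route, again on a made-up array d. It also shows np.take_along_axis, which is not part of the original answer and assumes NumPy 1.15 or later; it takes the argsort result directly, with no flat indices:

```python
import numpy as np

# Hypothetical 3x4 array standing in for dist.
d = np.array([[4., 1., 3., 2.],
              [9., 7., 8., 6.],
              [0., 5., 2., 1.]])

order = np.argsort(d, axis=1)
rows = np.arange(d.shape[0])[:, None]

# Fancy indexing: rows broadcasts against order to pick d[r, order[r, c]].
s_fancy = d[rows, order]

# Assumes NumPy >= 1.15: applies the per-row indexes along axis 1.
s_along = np.take_along_axis(d, order, axis=1)

print(np.array_equal(s_fancy, s_along))
```

Both results should match np.sort(d, axis=1), while order stays available for later steps.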
