Reputation: 1363
I am using scipy and its cdist function to compute a distance matrix from an array of vectors.
import numpy as np
from scipy.spatial import distance
vectorList = [(0, 10), (4, 8), (9.0, 11.0), (14, 14), (16, 19), (25.5, 17.5), (35, 16)]
#Convert to numpy array
arr = np.array(vectorList)
#Compute the distance matrix and set self-comparisons (the diagonal) to NaN
d = distance.cdist(arr, arr)
np.fill_diagonal(d, np.nan)
Let's say I want to return all the distances that are below a specific threshold (6, for example):
#Find pairs of vectors whose separation distance is < 6
id1, id2 = np.nonzero(d<6)
#id1 --> array([0, 1, 1, 2, 2, 3, 3, 4])
#id2 --> array([1, 0, 2, 1, 3, 2, 4, 3])
I now have 2 arrays of indices.
Question: how can I return the distances between these pairs of vectors as an array or list?
4.47213595499958 #d[0][1]
4.47213595499958 #d[1][0]
5.830951894845301 #d[1][2]
5.830951894845301 #d[2][1]
5.830951894845301 #d[2][3]
5.830951894845301 #d[3][2]
5.385164807134504 #d[3][4]
5.385164807134504 #d[4][3]
d[id1][id2]
returns a matrix, not a list. The only way I have found so far is to iterate over the index pairs and look each one up in the distance matrix again, which seems wasteful:
np.array([d[i1][i2] for i1, i2 in zip(id1, id2)])
Upvotes: 1
Views: 151
Reputation: 1106
Use
d[id1, id2]
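This is the form that the numpy.nonzero documentation example uses (i.e. a[np.nonzero(a > 3)]), and it is different from the d[id1][id2] you are using. d[id1][id2] first selects whole rows with id1 and then selects rows again from that intermediate result with id2, whereas d[id1, id2] pairs the two index arrays element by element and returns one value per pair.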
See arrays.indexing for more details on numpy indexing.
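To make the difference concrete, here is a minimal sketch that reuses the data from the question. The names chained, paired, and masked are only illustrative, and the last line (a boolean mask) is an alternative I am adding for comparison, not something the question asked for.

```python
import numpy as np
from scipy.spatial import distance

vectorList = [(0, 10), (4, 8), (9.0, 11.0), (14, 14), (16, 19), (25.5, 17.5), (35, 16)]
arr = np.array(vectorList)

# Distance matrix with self-comparisons set to NaN, as in the question
d = distance.cdist(arr, arr)
np.fill_diagonal(d, np.nan)

# Row and column indices of every entry below the threshold
id1, id2 = np.nonzero(d < 6)

# Chained indexing: d[id1] selects 8 whole rows, then [id2] selects
# rows again from that intermediate array, so the result stays 2-D
chained = d[id1][id2]

# Paired integer indexing: one value per (id1[k], id2[k]) pair
paired = d[id1, id2]

# Boolean mask: skips the index arrays and returns the same values
# in the same row-major order
masked = d[d < 6]

print(chained.shape)  # (8, 7)
print(paired.shape)   # (8,)
print(paired)
```

If the index arrays are not needed for anything else, the boolean mask is the shortest route. If you still need to know which pair each distance belongs to, keep id1 and id2 and use d[id1, id2].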
Upvotes: 2