Reputation: 138
I am trying to compute distance metrics between two 2D arrays, say A and B (n rows x 6 columns each), using the scipy.spatial.distance functions. I would like to compute these distances only between observations that share the same index (e.g. between A[i,:] and B[i,:]), and to do so efficiently (i.e. without looping over the array index).
I know that scipy.spatial.distance.cdist does this quickly, but it computes the distance between all pairs of observations, including those whose indices do not match. I am therefore looking for an equivalent that handles only index-matching observations.
Here is a simple example for computing the Euclidean distance:
import numpy as np
from scipy.spatial import distance
a = np.array([[1, 5, 6, 7, 8, 7, 9], [5, 7, 8, 6, 4, 1, 2]])
b = np.array([[9, 8, 9, 5, 7, 1, 2], [1, 5, 5, 7, 2, 1, 1]])
print(distance.cdist(a, b, 'euclidean'))  # Euclidean distance between all pairs of rows
for i in range(len(a)):
    print(distance.euclidean(a[i, :], b[i, :]))  # Does the job, but too slow
Thank you for your help!
Upvotes: 1
Views: 362
Reputation: 6475
You can use numpy.linalg.norm on the difference a - b, specifying axis=1 so that the norm is taken along each row:
import numpy as np
a = np.array([[1, 5, 6, 7, 8, 7, 9], [5, 7, 8, 6, 4, 1, 2]])
b = np.array([[9, 8, 9, 5, 7, 1, 2], [1, 5, 5, 7, 2, 1, 1]])
print(np.linalg.norm(a-b, axis=1))
[13.11487705 5.91607978]
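Since the question mentions scipy.spatial.distance metrics in general, the same row-wise idea can probably be extended beyond the Euclidean case. The following is only a minimal sketch using plain NumPy: the variable names (euclidean, cityblock, cosine) are illustrative and not part of any library, and the cosine formula assumes that no row is all zeros.

```python
import numpy as np

a = np.array([[1, 5, 6, 7, 8, 7, 9], [5, 7, 8, 6, 4, 1, 2]])
b = np.array([[9, 8, 9, 5, 7, 1, 2], [1, 5, 5, 7, 2, 1, 1]])

diff = a - b

# Euclidean: square root of the row-wise sum of squared differences
# (same result as np.linalg.norm(diff, axis=1)).
euclidean = np.sqrt(np.einsum('ij,ij->i', diff, diff))

# Manhattan / city block: row-wise sum of absolute differences.
cityblock = np.abs(diff).sum(axis=1)

# Cosine distance: 1 - (a_i . b_i) / (|a_i| * |b_i|), computed per row.
# Assumes no row has zero norm; otherwise this divides by zero.
dot = np.einsum('ij,ij->i', a, b)
cosine = 1.0 - dot / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))

print(euclidean)
print(cityblock)
print(cosine)
```

On small inputs, each result can be checked against the diagonal of the corresponding cdist call, e.g. np.diag(distance.cdist(a, b, 'cityblock')). That check builds the full n x n matrix, so it is only suitable for testing.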
Upvotes: 2