G.L

Reputation: 138

SciPy distance: computing distances between index-matching observations of two 2D arrays

I am trying to compute distance metrics between two 2D arrays, let's say A and B (n 'rows' x 6 'cols' each) using the scipy.spatial.distance functions. I would like to compute these distances between each pair of observations that correspond to the same index (e.g. between A[i,:] and B[i,:]) efficiently (i.e. without looping over the array index).

I know that scipy.spatial.distance.cdist does this quickly, but it computes the distance between all pairs of observations, including those whose indices do not match. I am therefore looking for an equivalent that only handles index-matching observations.

Here is a simple example that computes the Euclidean distance:

import numpy as np
from scipy.spatial import distance

a = np.array([[1, 5, 6, 7, 8, 7, 9], [5, 7, 8, 6, 4, 1, 2]])
b = np.array([[9, 8, 9, 5, 7, 1, 2], [1, 5, 5, 7, 2, 1, 1]])

print(distance.cdist(a, b, 'euclidean')) # Euclidean distance between every pair of rows, not just index-matching ones

for i in range(len(a)):
    print(distance.euclidean(a[i, :], b[i, :])) # Does the job, but the Python loop is too slow

Thank you for your help!

Upvotes: 1

Views: 362

Answers (1)

FBruzzesi

Reputation: 6475

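You can use numpy.linalg.norm with the axis parameter. Passing axis=1 takes the norm of each row of a - b, which is the Euclidean distance between a[i, :] and b[i, :]: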

import numpy as np

a = np.array([[1, 5, 6, 7, 8, 7, 9], [5, 7, 8, 6, 4, 1, 2]])
b = np.array([[9, 8, 9, 5, 7, 1, 2], [1, 5, 5, 7, 2, 1, 1]])

print(np.linalg.norm(a - b, axis=1))
# Output: [13.11487705  5.91607978]
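
Since the question asks about scipy.spatial.distance metrics in general, here is a minimal sketch of how the same row-wise idea might extend to two other metrics. The helper names rowwise_euclidean, rowwise_cityblock and rowwise_cosine are hypothetical (they are not part of NumPy or SciPy), and the cosine version assumes no row is all zeros:

```python
import numpy as np
from scipy.spatial import distance

a = np.array([[1, 5, 6, 7, 8, 7, 9], [5, 7, 8, 6, 4, 1, 2]])
b = np.array([[9, 8, 9, 5, 7, 1, 2], [1, 5, 5, 7, 2, 1, 1]])

# Hypothetical helpers: each returns one distance per row pair (x[i], y[i]).

def rowwise_euclidean(x, y):
    # L2 norm of each row of the difference
    return np.linalg.norm(x - y, axis=1)

def rowwise_cityblock(x, y):
    # Sum of absolute differences along each row
    return np.abs(x - y).sum(axis=1)

def rowwise_cosine(x, y):
    # 1 - cosine similarity per row; assumes no all-zero rows
    dot = np.einsum('ij,ij->i', x, y)  # row-wise dot products
    norms = np.linalg.norm(x, axis=1) * np.linalg.norm(y, axis=1)
    return 1.0 - dot / norms

print(rowwise_euclidean(a, b))
print(rowwise_cityblock(a, b))
print(rowwise_cosine(a, b))

# Sanity check: the index-matching distances are the diagonal of cdist
print(np.allclose(rowwise_euclidean(a, b),
                  np.diag(distance.cdist(a, b, 'euclidean'))))
```

Taking the diagonal of cdist is only useful as a check on small inputs, because it still computes all n x n pairs. The row-wise versions only touch the n matching pairs.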

Upvotes: 2
