Miguel 2488
Miguel 2488

Reputation: 1440

Calculate euclidean distance from scratch between 3 numpy arrays

I have 3 huge numpy arrays, and i want to build a function that computes the euclidean distance pairwise from the points of one array to the points of the second and third array.

For the sake of simplicity suppose i have these 3 arrays:

a = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

b = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

c = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

I have tried this:

def correlation(x, y, t):
    from math import sqrt

    for a,b, in zip(x,y,t):
        distance = sqrt((x[a]-x[b])**2 + (y[a]-y[b])**2 + (t[a]-t[b])**2 )
    return distance

But this code throws an error: ValueError: too many values to unpack (expected 2)

How can i correctly implement this function using numpy or base python?

Thanks in advance

Upvotes: 1

Views: 1262

Answers (2)

Redone R
Redone R

Reputation: 89

First we define a function which computes the distance between every pair of rows of two matrices.

def pairwise_distance(f, s, keepdims=False):
    return np.sqrt(np.sum((f-s)**2, axis=1, keepdims=keepdims))

Second we define a function which calculate all possible distances between every pair of rows of the same matrix:

def all_distances(c):
    res = np.empty(shape=c.shape, dtype=float)
    for row in np.arange(c.shape[0]):
        res[row, :] = pairweis_distance(c[row], c) #using numpy broadcasting
    return res

Now we are done

row_distances = all_distances(a) #row wise distances of the matrix a
column_distances = all_distances(a) #column wise distances of the same matrix
row_distances[0,2] #distance between first and third row
row_distances[1,3] #distance between second and fourth row

Upvotes: 1

zabop
zabop

Reputation: 7852

Start with two arrays:

a = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

b = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

To calculate the distance between elements of these arrays you can do:

pairwise_dist_between_a_and_b=[(each**2+b[index]**2)**0.5 for index, each in enumerate(a)]

By doing so you get pairwise_dist_between_a_and_b:

[array([2.31931024e+00, 1.41421356e-03, 2.20617316e+00, 1.41421356e-01]),
 array([2.34193766e+00, 1.71119841e+00, 4.52548340e-01, 1.41421356e-04]),
 array([1.41449641e+00, 4.24264069e-04, 1.57119127e+00, 4.24264069e-04]),
 array([0.31536962, 0.94257334, 1.72831039, 2.3461803 ])]

You can use the same list comprehension for the first and third array.

Upvotes: 0

Related Questions