Reputation: 1418
Given, two 3-D arrays of dimensions (2,2,2):
A = [[[ 0, 0],
[92, 92]],
[[ 0, 92],
[ 0, 92]]]
B = [[[ 0, 0],
[92, 0]],
[[ 0, 92],
[92, 92]]]
How do you find the Euclidean distance for each vector in A and B efficiently?
I have tried for-loops but these are slow, and I'm working with 3-D arrays in the order of (>>2, >>2, 2).
Ultimately I want a matrix of the form:
C = [[d1, d2],
[d3, d4]]
Edit:
I've tried the following loop. The distances it produces are correct, but the biggest issue is that it loses the dimensions I want to keep.
[numpy.sqrt((A[row, col][0] - B[row, col][0])**2 + (B[row, col][1] - A[row, col][1])**2) for row in range(2) for col in range(2)]
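As a minimal sketch of one way to keep the (2, 2) grid shape, the flat list the loop produces can simply be reshaped back (the vectorized answers below avoid the loop entirely):

```python
import numpy as np

A = np.array([[[ 0,  0], [92, 92]],
              [[ 0, 92], [ 0, 92]]])
B = np.array([[[ 0,  0], [92,  0]],
              [[ 0, 92], [92, 92]]])

# Same per-vector distances as the loop above, then restore the (2, 2) shape
C = np.array([np.sqrt((A[row, col, 0] - B[row, col, 0])**2 +
                      (A[row, col, 1] - B[row, col, 1])**2)
              for row in range(2) for col in range(2)]).reshape(2, 2)
```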
Upvotes: 6
Views: 5205
Reputation: 1815
I recommend being careful with hand-rolled squares and square roots instead of the built-in math.hypot and np.hypot. The built-ins are fast, optimized, and numerically safe.
In that sense, np.linalg.norm(A - B, axis=-1)
looks safest here.
For very large arrays, the broadcasting approach becomes memory-bound and slows down. In such cases you would want explicit loops without compromising on speed, and numba works well for that.
import math
i, j, k = 1e+200, 1e+200, 1e+200
math.hypot(i, j, k)  # three arguments require Python 3.8+
# 1.7320508075688773e+200
Refer: speed/overflow/underflow
np.sqrt(np.sum((np.array([i, j, k])) ** 2))  # naive version overflows
# RuntimeWarning: overflow encountered in square
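On the question's example arrays, the hypot route can be applied directly; a minimal sketch, using the fact that np.hypot takes exactly two array arguments, matching the length-2 last axis:

```python
import numpy as np

A = np.array([[[ 0,  0], [92, 92]],
              [[ 0, 92], [ 0, 92]]])
B = np.array([[[ 0,  0], [92,  0]],
              [[ 0, 92], [92, 92]]])

D = A - B
# np.hypot is element-wise and takes exactly two arguments,
# so feed it the two sliced components of the difference
C = np.hypot(D[..., 0], D[..., 1])
```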
Upvotes: 0
Reputation: 221514
Thinking in a NumPy vectorized way, this amounts to computing element-wise differences, squaring and summing along the last axis, and finally taking the square root. So, the straightforward implementation would be -
np.sqrt(((A - B)**2).sum(-1))
We could perform the squaring and summing along the last axis in one go with np.einsum
and thus make it more efficient, like so -
subs = A - B
out = np.sqrt(np.einsum('ijk,ijk->ij',subs,subs))
Another alternative with numexpr
module -
import numexpr as ne
np.sqrt(ne.evaluate('sum((A-B)**2,2)'))
Since we are working with a length of 2
along the last axis, we could just slice those components and feed them to the evaluate
method. Note that slicing isn't possible inside the evaluate string. So, the modified implementation would be -
a0 = A[...,0]
a1 = A[...,1]
b0 = B[...,0]
b1 = B[...,1]
out = ne.evaluate('sqrt((a0-b0)**2 + (a1-b1)**2)')
Runtime test
Function definitions -
def sqrt_sum_sq_based(A, B):
    return np.sqrt(((A - B)**2).sum(-1))

def einsum_based(A, B):
    subs = A - B
    return np.sqrt(np.einsum('ijk,ijk->ij', subs, subs))

def numexpr_based(A, B):
    return np.sqrt(ne.evaluate('sum((A-B)**2,2)'))

def numexpr_based_with_slicing(A, B):
    a0 = A[..., 0]
    a1 = A[..., 1]
    b0 = B[..., 0]
    b1 = B[..., 1]
    return ne.evaluate('sqrt((a0-b0)**2 + (a1-b1)**2)')
Timings -
In [288]: # Setup input arrays
...: dim = 2
...: N = 1000
...: A = np.random.rand(N,N,dim)
...: B = np.random.rand(N,N,dim)
...:
In [289]: %timeit sqrt_sum_sq_based(A,B)
10 loops, best of 3: 40.9 ms per loop
In [290]: %timeit einsum_based(A,B)
10 loops, best of 3: 22.9 ms per loop
In [291]: %timeit numexpr_based(A,B)
10 loops, best of 3: 18.7 ms per loop
In [292]: %timeit numexpr_based_with_slicing(A,B)
100 loops, best of 3: 8.23 ms per loop
In [293]: %timeit np.linalg.norm(A-B, axis=-1) #@dnalow's soln
10 loops, best of 3: 45 ms per loop
Upvotes: 5