Fastest way to compute distances between consecutive vectors with numpy/scipy

Question

I have a line growth algorithm where I need to:

calculate the distances (euclidean) between consecutive vectors in an array
insert a new vector where a distance is greater than a specific threshold

I usually do this in a very naive manner (see code below) and would like to know how to compute distances between consecutive vectors the fastest way possible using numpy (and scipy if needed).

import math


threshold = 10
vectorList = [(0, 10), (4, 8), (14, 14), (16, 19), (35, 16)]

for i in xrange(len(vectorList)):
    p1 = vectorList[i]
    p2 = vectorList[i+1]
    d = math.sqrt((p2[0] - p1[0])**2 + (p2[1] - p1[1])**2)
    if d >= threshold:
        pmid = ((p1[0] + p2[0]) * .5, (p1[1] + p2[1]) * .5)
        vectorList.insert(i+1, pmid)

EDIT: I've come up with the following workaround but I'm still concerned with the distance computation.

I would need to compute the distances between a vector and its next neighbor in a list instead of calculating a whole distance matrix (all vectors against each others) as I'm doing here.

import numpy as np

vectorList = [(0, 10), (4, 8), (14, 14), (16, 19), (35, 16)]
arr = np.asarray(vectorList).astype(float)


dis = distance.cdist(arr, arr).diagonal(1)
idx = np.where(dis > 10)[0]
vec = (arr[idx] + arr[idx+1]) * .5
arr = np.insert(arr, idx+1, vec, 0)

# output
array([[ 0. , 10. ],[ 4. ,  8. ],[ 9. , 11. ],[14. , 14. ],[16. , 19. ],[25.5, 17.5],[35. , 16. ]])

hpaulj · Accepted Answer

In [209]: vectorList = [(0, 10), (4, 8), (14, 14), (16, 19), (35, 16), (39,50)] 
In [210]: vectorList                                                            
Out[210]: [(0, 10), (4, 8), (14, 14), (16, 19), (35, 16), (39, 50)]

I added a point, making 3 possible insertion points.

In [212]: np.diff(vectorList, axis=0)                                           
Out[212]: 
array([[ 4, -2],
       [10,  6],
       [ 2,  5],
       [19, -3],
       [ 4, 34]])
In [213]: np.sum(np.diff(vectorList, axis=0)**2,1)                              
Out[213]: array([  20,  136,   29,  370, 1172])

distances:

In [214]: np.sqrt(np.sum(np.diff(vectorList, axis=0)**2,1))                     
Out[214]: array([ 4.47213595, 11.66190379,  5.38516481, 19.23538406, 34.23448554])

Mean values:

In [216]: arr = np.array(vectorList)                                            
In [217]: arr                                                                   
Out[217]: 
array([[ 0, 10],
       [ 4,  8],
       [14, 14],
       [16, 19],
       [35, 16],
       [39, 50]])

In [218]: (arr[:-1]+arr[1:])/2                                                  
Out[218]: 
array([[ 2. ,  9. ],
       [ 9. , 11. ],
       [15. , 16.5],
       [25.5, 17.5],
       [37. , 33. ]])

I could do something similar without diff:

d = np.sqrt(np.sum((arr[1:]-arr[:-1])**2,1))

The jumps that exceed the threshhold:

In [224]: idx = np.nonzero(d>10)                                                
In [225]: idx                                                                   
Out[225]: (array([1, 3, 4]),)
In [227]: _218[idx]       # the mean values to insert                                                            
Out[227]: 
array([[ 9. , 11. ],
       [25.5, 17.5],
       [37. , 33. ]])

Use np.insert to insert all values.

In [232]: np.insert(arr.astype(float), idx[0]+1, _227, axis=0)                  
Out[232]: 
array([[ 0. , 10. ],
       [ 4. ,  8. ],
       [ 9. , 11. ],
       [14. , 14. ],
       [16. , 19. ],
       [25.5, 17.5],
       [35. , 16. ],
       [37. , 33. ],
       [39. , 50. ]])

Fastest way to compute distances between consecutive vectors with numpy/scipy

Answers (2)

Related Questions