Reputation: 35
I have an array containing millions of entries. I would like to calculate another vector containing the distances between all pairs of entries that are shifted by a certain offset delta in the array.
Currently I'm using this:
import numpy
difs = numpy.array([])
for i in range(0, len(a) - delta):
    difs = numpy.append(difs, a[i + delta] - a[i])
Does anyone know how to do this faster?
There's a similar question here: Fastest pairwise distance metric in python
But I don't want to calculate the distance for every pair.
Example:
>>> a = [1,5,7,7,2,6]
>>> delta = 2
>>> print difs
array([ 6., 2., -5., -1.])
Upvotes: 1
Views: 276
Reputation: 176850
You could just slice a using delta and then subtract the two subarrays:
>>> import numpy as np
>>> a = np.array([1,5,7,7,2,6])
>>> delta = 2
>>> a[delta:] - a[:-delta]
array([ 6, 2, -5, -1])
This slicing operation is likely to be very quick for large arrays, as no additional indexes or copies of the data in a need to be created. The subtraction then creates a new array with the required values.
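If you want to check the speed difference on your own machine, a rough timeit comparison along these lines should do; the array size and repeat count below are only illustrative, not taken from the question:

import timeit

setup = """
import numpy as np
a = np.random.rand(10000)   # illustrative size; the real array is much larger
delta = 2
"""

loop_stmt = """
difs = np.array([])
for i in range(len(a) - delta):
    difs = np.append(difs, a[i + delta] - a[i])
"""

slice_stmt = "difs = a[delta:] - a[:-delta]"

# keep number small, the append-in-a-loop version gets slow quickly
print("loop + append:", timeit.timeit(loop_stmt, setup=setup, number=10))
print("slicing:      ", timeit.timeit(slice_stmt, setup=setup, number=10))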
Upvotes: 2
Reputation: 4311
Assuming a is a numpy.array, one could probably get the same result by indexing all pairs at once. This is a vectorized numpy solution.
import numpy
a = numpy.atleast_1d(a)  # make sure a is a numpy array
idx_minuend = range(delta, len(a))
idx_subtrahend = range(0, len(a) - delta)
difs = a[idx_minuend] - a[idx_subtrahend]
A little test verifies that the results are the same:
# a little test with your data
import numpy
a = [1,5,7,7,2,6]
delta = 2

# current version
difs = numpy.array([])
for i in range(0, len(a) - delta):
    difs = numpy.append(difs, a[i + delta] - a[i])

# numpy vectorized version
a = numpy.atleast_1d(a)  # make sure a is a numpy array
idx_minuend = range(delta, len(a))
idx_subtrahend = range(0, len(a) - delta)
difs2 = a[idx_minuend] - a[idx_subtrahend]

# compare results
(difs == difs2).all()  # True
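As a minor variation (just a sketch, not required for the answer above), the index vectors could also be built with numpy.arange instead of Python's range, which keeps the indexing entirely inside numpy and gives the same result:

# same pairing, but with numpy index arrays instead of range objects
idx_minuend = numpy.arange(delta, len(a))
idx_subtrahend = numpy.arange(len(a) - delta)
difs3 = a[idx_minuend] - a[idx_subtrahend]
(difs2 == difs3).all()  # True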
Upvotes: 0