Gabriel

Reputation: 42459

Improve performance of array handling

I have a large program that takes a while to run. I've tracked down the two lines that take up most of the time, and I'd like to know if there's a way to speed them up. Here's a MWE:

import numpy as np

def setup(k=2, m=100, n=300):
    # Return random points a (k, m), b (k, n) and weights w (k, m).
    return np.random.randn(k, m), np.random.randn(k, n), np.random.randn(k, m)

# Make some random points and weights.
a, b, w = setup()

# Weighted euclidean distance between arrays a and b.
wdiff = (a[np.newaxis,...] - b[np.newaxis,...].T) / w[np.newaxis,...]

# This is the set of operations that need a performance boost:
dist_1 = np.exp(-0.5*(wdiff*wdiff)) / w
dist_2 = np.array([i[0]*i[1] for i in dist_1])

For context, I'm coming from the question "Fast weighted euclidean distance between points in arrays", where ali_m's answer saved me a lot of time by applying broadcasting (which I know almost nothing about yet). Could something similar be applied to these lines?

Upvotes: 2

Views: 113

Answers (1)

DSM

Reputation: 353559

Your dist_2 calculation can be sped up by a factor of 10 or so by replacing the Python-level loop with a product along axis 1:

>>> dist_1.shape
(300, 2, 100)
>>> %timeit dist_2 = np.array([i[0]*i[1] for i in dist_1])
1000 loops, best of 3: 1.35 ms per loop
>>> %timeit dist_2 = dist_1.prod(axis=1)
10000 loops, best of 3: 116 µs per loop
>>> np.allclose(np.array([i[0]*i[1] for i in dist_1]), dist_1.prod(axis=1))
True
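
As a small illustration of why the prod(axis=1) form is convenient, here is a sketch on a stand-in array (random values, not the asker's data; the names k, stand_in, loop_result and prod_result are made up for this example). It assumes k = 3 rather than 2 to show that the reduction does not need to be rewritten when the number of rows changes, whereas the list comprehension would need one more factor:

```python
import numpy as np

# Hypothetical stand-in for dist_1 with k = 3 instead of k = 2.
rng = np.random.default_rng(0)
k = 3
stand_in = rng.random((300, k, 100))

# Loop version: every factor has to be written out by hand.
loop_result = np.array([i[0] * i[1] * i[2] for i in stand_in])

# Vectorised version: multiply along axis 1, whatever its length is.
prod_result = stand_in.prod(axis=1)

print(loop_result.shape, prod_result.shape)
print(np.allclose(loop_result, prod_result))
```

Both results should have shape (300, 100), since the axis of length k is the one being reduced.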

I couldn't manage to do much with your dist_1, as the majority of the time is spent in the exponentiation:

>>> %timeit (-0.5*(wdiff*wdiff)) / w
1000 loops, best of 3: 467 µs per loop
>>> %timeit np.exp((-0.5*(wdiff*wdiff)))/w
100 loops, best of 3: 3.3 ms per loop
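
One idea that might cut the exponentiation cost, if only dist_2 is needed and dist_1 itself is not used elsewhere (an assumption, the question does not say): a product of exponentials equals the exponential of the sum, so the squared terms can be summed along axis 1 first and np.exp called on an array that is k times smaller. The sketch below rebuilds the MWE with a seeded generator (the seed and the name dist_2_alt are choices made for this example) and checks that both routes agree. It has not been timed here, so any speedup is untested:

```python
import numpy as np

# Rebuild the MWE inputs with a fixed seed so the check is repeatable.
rng = np.random.default_rng(0)
k, m, n = 2, 100, 300
a = rng.standard_normal((k, m))
b = rng.standard_normal((k, n))
w = rng.standard_normal((k, m))

wdiff = (a[np.newaxis, ...] - b[np.newaxis, ...].T) / w[np.newaxis, ...]

# Route from the question and answer: exp on the full (n, k, m) array.
dist_1 = np.exp(-0.5 * (wdiff * wdiff)) / w
dist_2 = dist_1.prod(axis=1)

# Alternative: sum the exponents over axis 1, then exp once on (n, m).
# prod_j(exp(x_j) / w_j) == exp(sum_j x_j) / prod_j(w_j)
dist_2_alt = np.exp(-0.5 * (wdiff * wdiff).sum(axis=1)) / w.prod(axis=0)

print(np.allclose(dist_2, dist_2_alt))
```

Note that this skips dist_1 entirely, so it only helps when the intermediate array is not needed.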

Upvotes: 3
