geo.wolfer

Reputation: 41

Proper optimization of some calculation

Trying to optimize a piece of code with numpy, I am wondering if this is the right approach.

Here is the formula for the computation, the matrix P being lower-triangular:

    (sum_{i != j} P_ij) / (sum_{i,j} (t_i - t_j) * P_ij)

And here is my attempt:

(np.sum(P) - np.trace(P)) / np.sum(((t[np.newaxis]).T - t) * P)

Is this as good as it can get, or can you see a more efficient way?
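For reference, here is a minimal self-contained version of the expression above (the inputs are made-up small random arrays, just for illustration):

```python
import numpy as np

# small reproducible inputs, names matching the snippet above
rng = np.random.default_rng(0)
N = 4
P = np.tril(rng.random((N, N)))  # lower-triangular matrix
t = rng.random(N)

# numerator: sum of the off-diagonal entries of P
num = np.sum(P) - np.trace(P)
# denominator: sum over all i, j of (t_i - t_j) * P_ij
den = np.sum((t[:, None] - t) * P)
result = num / den
```

Note that `t[:, None]` is equivalent to `(t[np.newaxis]).T`: both turn `t` into an `(N, 1)` column so the subtraction broadcasts to an `(N, N)` matrix of pairwise differences.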

Upvotes: 0

Views: 59

Answers (1)

Divakar

Reputation: 221504

The bottleneck seems to be the computation of the denominator, and np.einsum should help there, as we are performing element-wise multiplication and sum-reduction. Thus, the denominator could be computed like so -

np.einsum('ij,ij',t[:,None]-t, P)

Timings and verification -

In [414]: N = 5000
     ...: P = np.random.rand(N,N)
     ...: t = np.random.rand(N)
     ...: out = (np.sum(P) - np.trace(P)) / np.sum(((t[np.newaxis]).T - t) * P)
     ...: 

# Original method    
In [415]: den1 = np.sum(((t[np.newaxis]).T - t) * P)

# Proposed method    
In [416]: den2 = np.einsum('ij,ij',t[:,None]-t, P)

In [417]: np.allclose(den1, den2)
Out[417]: True

In [419]: %timeit np.sum(((t[np.newaxis]).T - t) * P)
10 loops, best of 3: 86.9 ms per loop

In [420]: %timeit np.einsum('ij,ij',t[:,None]-t, P)
10 loops, best of 3: 49.7 ms per loop
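As a side note not raised in the answer: since sum_{i,j} (t_i - t_j) * P_ij = t · P.sum(axis=1) - t · P.sum(axis=0), the N x N temporary t[:, None] - t can be avoided entirely. A sketch, worth timing on real data:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000
P = rng.random((N, N))
t = rng.random(N)

# factor the double sum: sum_{i,j} (t_i - t_j) * P_ij
#   = sum_i t_i * (row sums of P)_i  -  sum_j t_j * (column sums of P)_j
den_factored = t @ P.sum(axis=1) - t @ P.sum(axis=0)

# the einsum form from the answer, for comparison
den_direct = np.einsum('ij,ij', t[:, None] - t, P)
```

This replaces an O(N^2) temporary with two O(N) dot products after the row/column sums.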

For the numerator, it seems most of the runtime is spent on np.sum(P):

In [422]: %timeit (np.sum(P) - np.trace(P))
100 loops, best of 3: 10.4 ms per loop

In [423]: %timeit np.sum(P)
100 loops, best of 3: 10.4 ms per loop

So, we can leave the numerator as it is.
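Putting both parts together, the full computation would look like this (a sketch assembling the answer's suggestion; the function name is mine):

```python
import numpy as np

def ratio(t, P):
    """(sum of off-diagonal entries of P) / (sum_{i,j} (t_i - t_j) * P_ij)."""
    num = np.sum(P) - np.trace(P)                # numerator kept as-is
    den = np.einsum('ij,ij', t[:, None] - t, P)  # einsum for the denominator
    return num / den

# quick check against the question's original expression
rng = np.random.default_rng(2)
t = rng.random(50)
P = np.tril(rng.random((50, 50)))
out = ratio(t, P)
```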

Upvotes: 1
