Reputation: 45
I need to optimize this code.
import numpy as np

# K     sparse node coupling matrix
# P     vector of net power production at each node
# alpha vector of damping at each node
# n     number of nodes (length of theta)
# theta vector of phases
# omega vector of angular velocities
def d_omega(K, P, alpha):
    def layer1(n):
        def layer2(theta, omega):
            T = theta[:, None] - theta
            T = np.sin(T)          # sin(theta[i] - theta[j]) -- the hot spot
            A = K * T
            A = A.dot(np.ones(n))  # sum the coupling terms over j
            return -alpha * omega + P - A
        return layer2
    return layer1
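For context, the nested closures are unwrapped in sequence to get the derivative function; a minimal usage sketch (the values below are hypothetical, just to show the call pattern):

# Usage sketch for d_omega above; all values are made up for illustration.
n = 4
K = np.ones((n, n))           # coupling matrix (dense here for simplicity)
P = np.zeros(n)               # net power production per node
alpha = 0.1 * np.ones(n)      # damping per node

f = d_omega(K, P, alpha)(n)   # f(theta, omega) -> d(omega)/dt
theta = np.linspace(0, 1, n)
omega = np.zeros(n)
print(f(theta, omega))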
I know a large portion of the computation time is spent in the
T = np.sin(T)
line. Is there a faster way of computing this?
Upvotes: 2
Views: 298
Reputation: 25083
From the assignment

T0 = θ[:, None] - θ

we have T0[i,j] = θ[i] - θ[j], and from

T1 = sin(T0)

we have T1[i,j] = sin(θ[i] - θ[j]). By the angle-difference identity,

T1[i,j] = sin(θ[i])*cos(θ[j]) - cos(θ[i])*sin(θ[j])

Recognizing that sin(θ[i])*cos(θ[j]) is the (i,j) element of the outer product of sin(θ) and cos(θ), we can write a little program, checked interactively:
In [42]: from numpy import allclose, arange, cos, sin
...: x = arange(5)
...: s0 = sin(x[:,None]-x)
...: s1 = sin(x[:,None]) * cos(x)
...: s1 -= s1.T
...: allclose(s0,s1)
Out[42]: True
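To gauge the payoff, here is a quick timing sketch of the identity-based version against the direct sine of the outer difference (size 1000 chosen to match the other answer; exact numbers will vary by machine):

import numpy as np
from timeit import timeit

rng = np.random.default_rng()
theta = rng.uniform(-np.pi, np.pi, 1000)

def direct():
    return np.sin(theta[:, None] - theta)

def identity():
    s = np.sin(theta[:, None]) * np.cos(theta)  # sin(θ[i])*cos(θ[j])
    return s - s.T                              # subtract sin(θ[j])*cos(θ[i])

print(np.allclose(direct(), identity()))
print(timeit(direct, number=10))
print(timeit(identity, number=10))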
Upvotes: 0
Reputation: 53089
Use Euler's formula. That way we can reduce the number of expensive function evaluations from n^2 to 2n (if we count complex functions double). What we are doing is expressing the sine of an outer difference as an outer product of complex exponentials: sin(θ[i] - θ[j]) = Im(exp(1j*θ[i]) * exp(-1j*θ[j])). As multiplication is much cheaper than a sine, this results in a good net speedup:
import numpy as np
from timeit import timeit

def pp():
    T = np.exp(1j * theta)
    T = np.outer(T, T.conj()).imag  # Im(e^{iθi} e^{-iθj}) = sin(θi - θj)
    return T

def orig():
    T = theta[:, None] - theta
    T = np.sin(T)
    return T

rng = np.random.default_rng()
theta = rng.uniform(-np.pi, np.pi, 1000)

print(np.allclose(orig(), pp()))
print(timeit(orig, number=10))
print(timeit(pp, number=10))
Sample run:
True # results are the same
0.2818465740419924 # original method
0.04591922601684928 # Euler
Speedup (which depends on the number of elements of theta, here 1000): ~6x.
Depending on the shape of K (scalar, 1D or 2D) we can optimize further:
def pp():
    T = np.exp(1j * theta)
    if K.ndim == 0:                      # scalar coupling
        A = T * (T.sum().conj() * K)
    elif K.ndim == 1:                    # per-node coupling
        A = T * (K @ T.conj())
    else:                                # full coupling matrix
        A = T * (K @ T.conj())
    return A.imag

def orig():
    T = theta[:, None] - theta
    T = np.sin(T)
    A = K * T
    A = A.dot(np.ones(1000))
    return A

rng = np.random.default_rng()
theta = rng.uniform(-np.pi, np.pi, 1000)

for K in (rng.uniform(-10, 10, shp) for shp in ((), (1000,), (1000, 1000))):
    print(np.allclose(orig(), pp()))
    print(timeit(orig, number=10))
    print(timeit(pp, number=10))
Sample run:
True                   # scalar K
0.493153132032603
0.0012746050488203764
True                   # 1D K
0.49636399815790355
0.0012419759295880795
True                   # 2D K
0.47554834792390466
0.05685037490911782
So with K scalar or a vector we get a ~350x speedup, while if K is a matrix it is only ~8x.
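Since the question describes K as sparse, the matrix case could likely be pushed further by storing K in a scipy.sparse format, so that K @ T.conj() costs time proportional to the number of nonzeros rather than n^2. A sketch, assuming scipy is available and K is sparse enough to be worth it:

import numpy as np
from scipy import sparse

rng = np.random.default_rng()
n = 1000
theta = rng.uniform(-np.pi, np.pi, n)
# Hypothetical sparse coupling matrix with 1% nonzeros.
K = sparse.random(n, n, density=0.01, random_state=0, format="csr")

T = np.exp(1j * theta)
A = (T * (K @ T.conj())).imag  # same result as the dense 2D branch above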
Upvotes: 2