Reputation: 170
I'm writing a Python function to integrate a vector field on a high-dimensional matrix space. A, of shape (n, m), is a matrix whose time derivative is linear in each of its components A[i, j]. We can collect all of the coefficients of the derivative into a 4D array C such that C[i, j, k, l] is the coefficient of A[k, l] in the derivative of A[i, j]. In this case, the derivative of A is given by dA[i, j] == (C[i, j] * A).sum(). Thus it is correct to compute
dA = np.array([[(Cij * A).sum() for Cij in Ci] for Ci in C])
Fortunately, C can be represented as a sparse.COO object, so the above requires only O(nm) multiplications. But the two for loops are still slow. Thanks to a helpful comment I improved this to
dA = (C * A).sum(axis=3).sum(axis=2)
leveraging broadcasting for a significant speedup. Can anyone go faster?
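For reference, the nested-loop and broadcast formulations above can be checked against each other on a small dense example (a minimal sketch with arbitrary shapes; the random test data is not from the original question):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 5
A = rng.standard_normal((n, m))
# C[i, j, k, l] is the coefficient of A[k, l] in dA[i, j]
C = rng.standard_normal((n, m, n, m))

# Nested-loop version
dA_loop = np.array([[(Cij * A).sum() for Cij in Ci] for Ci in C])

# Broadcast version: (n, m, n, m) * (n, m) broadcasts over the last two axes
dA_bcast = (C * A).sum(axis=3).sum(axis=2)

print(np.allclose(dA_loop, dA_bcast))  # True
```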
Upvotes: 2
Views: 448
Reputation: 1623
You could use np.einsum to accelerate this even more, since it avoids allocating the intermediate (C * A) array. Or, at a minimum, you could do (C * A).sum(axis=(2,3)) to remove one intermediate step.
import numpy as np

A = np.full((12, 12), 2)
C = np.full((12, 12, 3, 2), 1).T  # shape (2, 3, 12, 12)
dA = (C * A).sum(axis=3).sum(axis=2)

# Both alternatives should match the two-step sum:
print(np.einsum('abkl,ijkl->ij', A[None, None], C) == dA)
print((C * A).sum(axis=(2, 3)) == dA)
Output:
[[ True True True]
[ True True True]]
[[ True True True]
[ True True True]]
To be entirely honest, I don't completely understand your mathematical problem, and I'm also not that good with einsum. That is, you should double-check that the algorithm and the test case are correct :)
EDIT: added the .sum(axis=(2,3)) method
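Two further variants may be worth benchmarking: a direct einsum signature for the contraction sum over k, l of C[i, j, k, l] * A[k, l], and the equivalent np.tensordot call. This is a sketch assuming C has shape (n, m, n, m) as in the question, with arbitrary test shapes:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 5
A = rng.standard_normal((n, m))
C = rng.standard_normal((n, m, n, m))

dA_ref = (C * A).sum(axis=(2, 3))

# Same contraction, written directly: dA[i, j] = sum_kl C[i, j, k, l] * A[k, l]
dA_einsum = np.einsum('ijkl,kl->ij', C, A)

# tensordot contracts C's axes (2, 3) against A's axes (0, 1)
dA_tdot = np.tensordot(C, A, axes=([2, 3], [0, 1]))

print(np.allclose(dA_ref, dA_einsum), np.allclose(dA_ref, dA_tdot))  # True True
```

tensordot in particular reduces to a reshaped matrix product, which can be faster than elementwise multiply-then-sum for dense arrays.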
Upvotes: 1