Frank Seidl
Frank Seidl

Reputation: 170

Accelerate numpy calculation

I'm writing a Python function to integrate a vector field on a high-dimensional matrix space. A, shape (n, m), is a matrix whose time derivative is linear in each of its components A[i, j]. We can collect all of the coefficients of the derivative into a 4D array C such that C[i, j, k, l] is the coefficient of A[k, l] in the derivative of A[i, j]. In this case, the derivative of A is given by dA[i, j] == (C[i, j] * A).sum(). Thus it is correct to compute

dA = np.array([[ (Cij * A).sum() for Cij in Ci ] for Ci in C ])

Fortunately, C can be represented as a sparse.COO object so that the above requires only O(nm) multiplications. But the two for loops are still slow. Thanks to a helpful comment I improved this to

dA = (C * A).sum(axis=3).sum(axis=2)

leveraging broadcasting for a significant speedup. Can anyone go faster?

Upvotes: 2

Views: 448

Answers (1)

Naphat Amundsen
Naphat Amundsen

Reputation: 1623

You could use np.einsum to accelerate this even more, as you won't have to do any intermediate calculations. Or at least you could do (C * A).sum(axis=(2,3)) to remove one intermediate step.

import numpy as np
A = np.full((12,12), 2)
C = np.full((12,12,3,2), 1).T
dA = (C * A).sum(axis=3).sum(axis=2)
print(np.einsum('abkl,ijkl->ij', A[None, None], C) == dA)
print((C * A).sum(axis=(2,3)) == dA)

Output:

[[ True  True  True]
 [ True  True  True]]
[[ True  True  True]
 [ True  True  True]]

To be entirely honest, I don't completely understand your mathematical problem, and I am also not that good with einsum. That is, you should double-check that the algorithm and test case is correct :)

EDIT: Added .sum(axis=(2,3)) method

Upvotes: 1

Related Questions