Reputation: 173
In scientific computation, a 3D field can be discretized as F[nx, ny, nz], where nx, ny and nz are the numbers of grid points in the three directions. At every point, assume an n-by-n tensor is attached, so the tensor field can be represented by a 5D array T[n, n, nx, ny, nz]. The tensor at any point [i, j, k] can be selected as T[:, :, i, j, k]. To compute the sum of the off-diagonal elements at each point, I would use code like
import numpy as np

r = np.zeros((nx, ny, nz))
for i in range(nx):
    for j in range(ny):
        for k in range(nz):
            r[i,j,k] = np.sum(T[:,:,i,j,k]) - np.trace(T[:,:,i,j,k])
The result array r and the tensor field T have different dimensions. Looping over every element like this is inefficient in Python. Is there a way to do a vectorized or otherwise efficient computation for arrays with different dimensions, or is there another data type/structure that could be used?
Upvotes: 3
Views: 397
Reputation: 879093
Below are two different alternatives. The first uses ndarray.sum and NumPy integer array indexing. The second uses np.einsum.
def using_sum(T):
    total = T.sum(axis=1).sum(axis=0)
    m = np.arange(T.shape[0])
    trace = T[m, m].sum(axis=0)
    return total - trace

def using_einsum(T):
    return np.einsum('mnijk->ijk', T) - np.einsum('nnijk->ijk', T)
The first argument of np.einsum specifies the subscripts of the summation. 'mnijk->ijk' indicates that T has subscripts mnijk and only the ijk subscripts remain after summation. Therefore the summation is performed over the m and n subscripts. This makes np.einsum('mnijk->ijk', T)[i,j,k] equal to np.sum(T[:,:,i,j,k]), but it computes the entire array in one vectorized operation.
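For instance, a minimal spot check on a small random array (the shape (3, 3, 2, 2, 2) is just an illustrative choice):

import numpy as np

T = np.random.random((3, 3, 2, 2, 2))
summed = np.einsum('mnijk->ijk', T)   # sum over m and n -> shape (2, 2, 2)

# Compare one grid point against the per-point expression from the question.
i, j, k = 1, 0, 1
print(np.allclose(summed[i, j, k], np.sum(T[:, :, i, j, k])))   # True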
Similarly, 'nnijk->ijk' tells np.einsum that T has subscripts nnijk and again only the ijk subscripts survive summation. Therefore the summation is over n. Since n is repeated, only the diagonal elements T[p, p, i, j, k] are selected, so the summation over n computes the trace.
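The same kind of spot check (again on a small random array with an arbitrary shape) confirms that the repeated subscript reproduces np.trace at each grid point:

import numpy as np

T = np.random.random((3, 3, 2, 2, 2))
traced = np.einsum('nnijk->ijk', T)   # trace over the repeated n -> shape (2, 2, 2)

i, j, k = 0, 1, 1
print(np.allclose(traced[i, j, k], np.trace(T[:, :, i, j, k])))   # True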
I like np.einsum because it communicates the intent of the computation succinctly. But it's also good to be familiar with how using_sum works, since it uses fundamental NumPy operations. It's a good example of how nested loops can be avoided by using NumPy methods that operate on whole arrays.
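To see how using_sum proceeds step by step, the intermediate shapes can be inspected on a small array (a minimal sketch; the shape is again arbitrary):

import numpy as np

T = np.random.random((3, 3, 2, 2, 2))

total = T.sum(axis=1).sum(axis=0)   # sum over both tensor axes -> shape (2, 2, 2)

m = np.arange(T.shape[0])           # [0, 1, 2]
diag = T[m, m]                      # integer array indexing picks T[p, p, :, :, :] -> shape (3, 2, 2, 2)
trace = diag.sum(axis=0)            # sum of the diagonal elements -> shape (2, 2, 2)

print(total.shape, diag.shape, trace.shape)   # (2, 2, 2) (3, 2, 2, 2) (2, 2, 2)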
Here's a perfplot comparing the performance of orig versus using_sum and using_einsum as a function of n, where T is taken to have shape (10, 10, n, n, n):
import perfplot
import numpy as np

def orig(T):
    _, _, nx, ny, nz = T.shape
    r = np.zeros((nx, ny, nz))
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                r[i,j,k] = np.sum(T[:,:,i,j,k]) - np.trace(T[:,:,i,j,k])
    return r

def using_einsum(T):
    r = np.einsum('mnijk->ijk', T) - np.einsum('nnijk->ijk', T)
    return r

def using_sum(T):
    total = T.sum(axis=1).sum(axis=0)
    m = np.arange(T.shape[0])
    trace = T[m,m].sum(axis=0)
    return total - trace

def make_T(n):
    return np.random.random((10,10,n,n,n))

perfplot.show(
    setup=make_T,
    kernels=[orig, using_sum, using_einsum],
    n_range=range(2, 80, 3),
    xlabel='n')
perfplot.show also checks that the values returned by orig, using_sum and using_einsum are equal.
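If perfplot isn't installed, an equivalent correctness check (a minimal sketch reusing the functions defined in the snippet above) is:

T = make_T(5)
assert np.allclose(orig(T), using_sum(T))
assert np.allclose(orig(T), using_einsum(T))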
Upvotes: 4