digitaldingo
digitaldingo

Reputation: 519

Efficient iteration over 3D array?

I am using Python and Numpy to do some data analysis.

I have a large 3D matrix (NxNxN), where each cell is again a matrix, this time a 3x3 matrix. Calling the matrix data, it looks like this:

data[N,N,N,3,3]  

I need to find the eigenvalues of all the 3x3 matrices, and for that I use Numpy's eigvals routine, but it takes ages to do. Right now I pretty much do this:

for i in range(N):
    for j in range(N):
        for k in range(N):
            a = np.linalg.eigvals(data[i,j,k,:,:])

For N = 256, this takes about an hour. Any ideas on how to make this more efficient?

Many thanks for any suggestions!

Upvotes: 5

Views: 5834

Answers (3)

HYRY
HYRY

Reputation: 97261

since all the calculation are independent, you can use multiprocessing module to speed up the calculation if you have a multi-core processor.

Upvotes: 2

senderle
senderle

Reputation: 150957

itertools.product is nicer than nested loops, aesthetically speaking. But I don't think it will make your code that much faster. My testing suggests that iteration is not your bottleneck.

>>> bigdata = numpy.arange(256 * 256 * 256 * 3 * 3).reshape(256, 256, 256, 3, 3)
>>> %timeit numpy.linalg.eigvals(bigdata[100, 100, 100, :, :])
10000 loops, best of 3: 52.6 us per loop

So underestimating:

>>> .000052 * 256 * 256 * 256 / 60
14.540253866666665

That's 14 minutes minimum on my computer, which is pretty new. Let's see how long the loops take...

>>> def just_loops(N):
...     for i in xrange(N):
...         for j in xrange(N):
...             for k in xrange(N):
...                 pass
... 
>>> %timeit just_loops(256)
1 loops, best of 3: 350 ms per loop

Orders of magnitude smaller, as DSM said. Even the work of slicing the array alone is more substantial:

>>> def slice_loops(N, data):
...     for i in xrange(N):
...         for j in xrange(N):
...             for k in xrange(N):
...                 data[i, j, k, :, :]
... 
>>> %timeit slice_loops(256, bigdata)
1 loops, best of 3: 33.5 s per loop

Upvotes: 5

agf
agf

Reputation: 176740

I'm sure there is a good way to do this in NumPy, but in general, itertools.product is faster than nested loops over ranges.

from itertools import product

for i, j, k in product(xrange(N), xrange(N), xrange(N)):
    a = np.linalg.eigvals(data[i,j,k,:,:])

Upvotes: 4

Related Questions