TomCho
TomCho

Reputation: 3507

Avoid memory errors when integrating large array with Numpy

I have a 601x350x200x146 numpy float64 array which, according to my calculations takes about 22.3 Gb of memory. My output of free -m tells me I have about 100Gb of free memory so it fits fine. However, when integrating with

result = np.trapz(large_arr, axis=3)

I get a memory error. I understand that this is because of the intermediate arrays that numpy.trapz has to create to perform the integration. But I'm looking to see if there's a way around it, or at least a way to minimize the extra use of memory.

I have read about memory errors and I know of things to avoid this: one is placing a gc.collect() call before the integration. I tried this and it didn't work.

The other one is using the *= operators such as writing arr*=a instead of arr=arr*a, which I can't really do here. So I don't know what else to try.

Does anyone know of a way to do this operation without raising a memory error?

You can reproduce the error with:

arr = np.ones((601,350,200,146), dtype=np.float64)
arr=np.trapz(arr, axis=3)

although you'll have to scale down the size to match your memory size.

Upvotes: 0

Views: 570

Answers (1)

Warren Weckesser
Warren Weckesser

Reputation: 114831

numpy.trapz provides some convenience, but the actual calculation is very simple. To avoid large temporary arrays, just implement it yourself:

In [37]: x.shape
Out[37]: (2, 4, 4, 10)

Here's the result of numpy.trapz(x, axis=3):

In [38]: np.trapz(x, axis=3)
Out[38]: 
array([[[ 43. ,  48.5,  46.5,  67. ],
        [ 35.5,  39.5,  52.5,  35. ],
        [ 44.5,  47.5,  34.5,  39.5],
        [ 54. ,  40. ,  46.5,  50.5]],

       [[ 42. ,  60. ,  55.5,  51. ],
        [ 51.5,  40. ,  52. ,  42.5],
        [ 48.5,  43. ,  32. ,  36.5],
        [ 42.5,  38. ,  38. ,  45. ]]])

Here's the calculation written to use no large intermediate arrays. (The slice x[:,:,:,1:-1] does not copy the data associated with the array.)

In [48]: 0.5*(x[:,:,:,0] + 2*x[:,:,:,1:-1].sum(axis=3) + x[:,:,:,-1])
Out[48]: 
array([[[ 43. ,  48.5,  46.5,  67. ],
        [ 35.5,  39.5,  52.5,  35. ],
        [ 44.5,  47.5,  34.5,  39.5],
        [ 54. ,  40. ,  46.5,  50.5]],

       [[ 42. ,  60. ,  55.5,  51. ],
        [ 51.5,  40. ,  52. ,  42.5],
        [ 48.5,  43. ,  32. ,  36.5],
        [ 42.5,  38. ,  38. ,  45. ]]])

If x has shape (m, n, p, q), the few temporary arrays that are generated in that expression all have shape (m, n, p).

Upvotes: 4

Related Questions