Array fits comfortably within available RAM, but a memory error still occurs when calling numpy.take on it

Question

I have three arrays of shape (1029, 1146, 8, 5): H4, rowOffsets, and colOffsets. H4 is float32, while the other two are int. At 4 bytes per element, H4 takes up 188.7 MB.

My machine has 32 GB of RAM in total, with 18 GB currently available. I used platform.architecture() to verify that the Python interpreter is 64-bit, so that RAM ought to be usable.

It seems like I'm nowhere near the memory limit, yet I get a memory error when I run the following:

shifted = np.take(H4, rowOffsets, 0, mode='clip')

I tested this further by running the code up to the np.take call with a much larger input of shape (3000, 3000, 8, 5). That used roughly 7 times more memory, yet no memory error occurred until the np.take call.

So I figure that either I'm using np.take wrong, there's a bug in it, or it consumes a massive amount of memory while executing. Can anyone help clarify what's happening here?

Paul Panzer · Accepted Answer

With multi-dimensional arguments, take takes a full slice over all dimensions except axis for each entry in indices. With axis=0, the result has shape rowOffsets.shape + H4.shape[1:]. So the way you use it, the result would occupy 1029 * 1146**2 * 8**2 * 5**2 * itemsize bytes, which is roughly 8.6 TB for float32 and explains your memory problems.
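The shape rule can be checked on small stand-in arrays without risking a memory error. The sketch below uses made-up shapes (4, 3, 2, 5) in place of the real ones, and the names a, idx, and predicted_bytes are illustrative only. It then estimates the size of the result for the shapes in the question, assuming a 4-byte itemsize, without allocating anything.

```python
import math

import numpy as np

# Small stand-ins for H4 and rowOffsets (hypothetical shapes, same rank as in the question).
a = np.zeros((4, 3, 2, 5), dtype=np.float32)
idx = np.zeros((4, 3, 2, 5), dtype=np.intp)

# For axis=0 the result shape is idx.shape + a.shape[1:].
out = np.take(a, idx, axis=0, mode='clip')
print(out.shape)  # (4, 3, 2, 5, 3, 2, 5)

# Size estimate for the shapes in the question, computed without allocating anything.
shape = (1029, 1146, 8, 5)
itemsize = 4  # float32
predicted_bytes = math.prod(shape) * math.prod(shape[1:]) * itemsize
print(predicted_bytes / 1e12, "TB")
```

The estimate comes out at about 8.6 TB, far beyond the 18 GB available, so the memory error is expected rather than a bug.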

You probably want to use take_along_axis instead.
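A minimal sketch of that replacement, using small random stand-in data and assuming rowOffsets holds absolute row indices (the question does not say whether they are absolute or relative). np.take_along_axis has no mode argument, so the indices are clipped explicitly with np.clip to mimic mode='clip'.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small stand-ins for the arrays in the question (hypothetical shapes and values).
H4 = rng.random((6, 4, 2, 3), dtype=np.float32)
rowOffsets = rng.integers(-2, 9, size=H4.shape)  # some indices fall out of range on purpose

# Clip the indices to the valid range along axis 0, as mode='clip' would.
idx = np.clip(rowOffsets, 0, H4.shape[0] - 1)

# One element is picked per index: shifted[i, j, k, l] == H4[idx[i, j, k, l], j, k, l]
shifted = np.take_along_axis(H4, idx, axis=0)

print(shifted.shape)  # same shape as H4
```

Here the result has the same shape as H4, so for the arrays in the question it would cost about as much memory as H4 itself.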
