Néstor

Reputation: 585

Memory blowing up when filling numpy array

I'm trying to figure out a Python memory-handling issue I'm having when filling an array. I'm filling a huge multi-dimensional array of shape [2048, 3000, 256, 76], which I have already created, so its memory is already allocated. I fill it in a for loop with random numbers, like so:

import numpy as np

myarray = np.zeros((2048,3000,256,76))
for i in range(2048):
    myarray[i,:,:,:] = np.random.normal(0.,1.,[3000,256,76])

However, if I watch the memory the process is using, it keeps increasing steadily up to the point where I have to kill the process. I presume this is because the previous calls to np.random.normal (whose values are already stored in myarray) are not being disposed of. How can I get rid of them? Is it possible? I've tried running the garbage collector, but that didn't work.

I realize this is a rather basic question, but all my memory-allocation skills come from C. There it was just a matter of freeing arrays/vectors to avoid problems like this, but I don't know how to translate that to disposing of Python objects beyond del and gc calls.

Thanks in advance for any pointers (pun intended)!

PS: This is just a toy code snippet of a larger problem. My actual problem involves multithreading, but this should shed some light on that problem too.

Upvotes: 2

Views: 546

Answers (1)

Eric

Reputation: 97571

Your array is huge. 891 GiB of huge, to be precise. On my system (Windows), I get a MemoryError:

>>> myarray = np.zeros((2048,3000,256,76))
MemoryError: Unable to allocate 891. GiB for an array with shape (2048, 3000, 256, 76) and data type float64
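
That figure is just the element count times 8 bytes per float64:

>>> 2048 * 3000 * 256 * 76 * 8 / 2**30
890.625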

"which I already created so its memory is already allocated."

This unfortunately isn't true. On systems other than Windows, I believe the OS does not actually commit the memory when np.zeros returns; physical pages are only allocated once you replace the zeros with real data, which is why your memory usage keeps climbing as the loop fills the array.
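
Here is a minimal sketch of what that looks like in practice, using a much smaller array that actually fits in RAM (this assumes psutil is installed, purely to read the process's resident set size):

import os
import numpy as np
import psutil  # assumed available; used only to read the process's RSS

proc = psutil.Process(os.getpid())

def rss_gib():
    # Resident set size: physical memory actually committed to this process
    return proc.memory_info().rss / 2**30

print("start:       %.2f GiB" % rss_gib())
a = np.zeros((1024, 1024, 128))              # ~1 GiB of float64, but only virtual address space so far
print("after zeros: %.2f GiB" % rss_gib())   # barely changes on Linux/macOS
a[:] = np.random.normal(0., 1., a.shape)     # writing real data forces the pages to be committed
print("after write: %.2f GiB" % rss_gib())   # jumps by roughly 1 GiB

Which is presumably also why Windows fails up front with the MemoryError above, while Linux hands out virtual pages and only runs out of physical memory later, partway through the loop.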

Upvotes: 3
