Reputation: 1904
I'm getting a MemoryError with numpy.where, but I'm not sure why. I can't post the actual code here, but below is a small working example that replicates the issue.
import numpy as np
dat = np.random.randn(100000, 1, 1, 1, 45, 2, 3)
# The following two steps seem superfluous but I wanted to replicate
# behaviour in the original code
cond = dat[:,0,0,0,0,0,0] > 0
cond = cond[:,None,None,None,None,None,None]
dat2 = np.where(cond, dat, 0)
dat[...,2] = np.where(cond, dat[...,2], dat2[...,2]) # Causes MemoryError
I understand that adding more memory to my computer will solve the issue, but I would like to understand what is going on here.
I expected the array slices above to return views rather than copies, but it seems that the array is actually being copied for some reason.
Upvotes: 1
Views: 891
Reputation: 1675
There is no "magic" going on here: the data array that you create using np.random.randn(100000, 1, 1, 1, 45, 2, 3) is very large.
NumPy stores each number as a 64-bit (8-byte) float by default, so your array takes up about 216 MB, or roughly 206 MiB (100000 * 1 * 1 * 1 * 45 * 2 * 3 * 8 bytes).
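The size estimate can be checked without allocating the array. The following is a minimal sketch that assumes the default float64 dtype returned by np.random.randn; the variable names are only illustrative.

```python
import math

import numpy as np

# Shape from the question's example
shape = (100000, 1, 1, 1, 45, 2, 3)

# Number of elements and bytes per element (float64 is assumed here,
# since that is what np.random.randn returns)
n_elements = math.prod(shape)
itemsize = np.dtype(np.float64).itemsize

total_bytes = n_elements * itemsize
print(f"{n_elements} elements x {itemsize} bytes = {total_bytes} bytes")
print(f"{total_bytes / 1e6:.0f} MB, {total_bytes / 2**20:.0f} MiB")
```

For an array that already exists, dat.nbytes gives the same figure directly.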
Running /usr/bin/time -v python test.py reports that the program uses around 580 MB at its peak, which might be due to copying the object.
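To see where copies can come from, here is a small sketch using a reduced first dimension (1000 instead of 100000, an assumption made only to keep it cheap to run). It checks that basic slicing returns a view, while np.where allocates a new array as large as the broadcast result, so dat2 is a second full-size array alongside dat.

```python
import numpy as np

# Same layout as the question, but with a smaller first axis (assumption)
dat = np.random.randn(1000, 1, 1, 1, 45, 2, 3)
cond = dat[:, 0, 0, 0, 0, 0, 0] > 0
cond = cond[:, None, None, None, None, None, None]

# Basic slicing returns a view that shares memory with dat
sl = dat[..., 2]
slice_is_view = np.shares_memory(sl, dat)

# np.where returns a freshly allocated array, not a view of its inputs
dat2 = np.where(cond, dat, 0)
where_is_copy = not np.shares_memory(dat2, dat)
same_size = dat2.nbytes == dat.nbytes

print("slice shares memory with dat:", slice_is_view)
print("np.where result is a separate array:", where_is_copy)
print("dat2 is as large as dat:", same_size)
```

The second np.where call in the question also builds a temporary result for the right-hand side before it is assigned into dat[..., 2], which adds to the peak usage on top of dat and dat2.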
Upvotes: 1