Reputation: 138
Consider the code below, run in IDLE with Python 3:
import numpy as np

a = np.ones((1000,1000,1000), dtype=np.float64)

def func1(b):
    b = np.ones((1000,1000,1000), dtype=np.float32)

def func2(b):
    b = np.ones((1000,1000,1000), dtype=np.float32)
    return b
Calling func1(a) results in a brief memory spike that is immediately released by the garbage collector. Calling func2(a), however, spikes memory immediately and it stays there (until gc.collect() is explicitly called), even though the return value wasn't assigned to any variable. I have no other references to that value anywhere else.
Making this even simpler:
def func1():
    b = np.ones((1000,1000,1000), dtype=np.float32)

def func2():
    b = np.ones((1000,1000,1000), dtype=np.float32)
    return b

def func3():
    return np.ones((1000,1000,1000), dtype=np.float32)
Calling func1() then behaves the same as in the previous scenario: memory spikes and is immediately released. When calling func2() or func3(), however, no memory is allocated at all, which is the opposite of the first scenario.
What is happening under the hood to explain this memory allocation behavior? How does the return statement affect the allocation?
Upvotes: 2
Views: 42
Reputation: 362786
You mentioned that you are running in IDLE.
An interactive session saves a reference to the last displayed value in the _ variable (builtins._), so your numpy array still has a nonzero reference count, which prevents it from being freed immediately. Deleting builtins._ should free the memory.
See the sys.displayhook docs for more info.
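Here is a minimal sketch of that mechanism, assuming CPython (where an object is freed as soon as its reference count reaches zero). The Big class and make() are hypothetical stand-ins for the large array and for func2/func3. The sketch calls sys.__displayhook__ (the default hook) directly to imitate what the interactive prompt does with the value of an expression:

```python
import builtins
import sys
import weakref


class Big:
    """Hypothetical stand-in for a large array (weak-referenceable)."""


def make():
    # Plays the role of func2/func3: returns a freshly created object.
    return Big()


obj = make()
ref = weakref.ref(obj)       # lets us observe whether the object is still alive

# Imitate the interactive prompt: the default display hook prints the
# repr of the value and stores the value in builtins._
sys.__displayhook__(obj)
del obj                      # drop our own name; builtins._ still refers to it

alive_after_display = ref() is not None   # kept alive by builtins._

del builtins._               # remove the reference held by the session
alive_after_delete = ref() is not None    # no references left (CPython frees it)

print(alive_after_display, alive_after_delete)
```

In a real session the same effect should be reachable by evaluating any other non-None expression, since that rebinds _ to the new value.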
Upvotes: 2