jolaem

Reputation: 138

How does NumPy handle memory allocation when arrays are passed as arguments or returned?

SCENARIO 1:

Consider the code below, run in IDLE with Python 3:

import numpy as np

a = np.ones((1000,1000,1000), dtype=np.float64)

def func1(b):
    b = np.ones((1000,1000,1000), dtype=np.float32)

def func2(b):
    b = np.ones((1000,1000,1000), dtype=np.float32)
    return b

Calling func1(a) causes a brief memory spike that is immediately released by the garbage collector. Calling func2(a) also spikes memory, but the memory stays allocated until gc.collect() is called explicitly, even though the return value was not assigned to any variable. I hold no other references to that value anywhere else.


SCENARIO 2:

Making this even simpler:

def func1():
    b = np.ones((1000,1000,1000), dtype=np.float32)

def func2():
    b = np.ones((1000,1000,1000), dtype=np.float32)
    return b

def func3():
    return np.ones((1000,1000,1000), dtype=np.float32)

Calling func1() behaves as in the previous scenario: memory spikes and is immediately released. Calling func2() or func3(), however, results in no memory being allocated at all, which is the opposite of the first scenario.

What is happening under the hood to explain this memory allocation behavior? How does the return statement affect the allocation?

Upvotes: 2

Views: 42

Answers (1)

wim

Reputation: 362786

You mentioned you are running in IDLE.

The interactive session saves a reference to the result of the last evaluated expression in the _ variable, so your numpy array still has a live reference that prevents it from being freed immediately. Deleting builtins._ should free the memory.

See sys.displayhook docs for more info.
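
To see the effect in isolation, here is a minimal sketch that calls the default display hook by hand, which is roughly what the interactive prompt does for an expression statement. The Big class and make() function are hypothetical stand-ins for the large array and func2; the immediate freeing at the end assumes CPython's reference counting.

```python
import builtins
import sys
import weakref


class Big:
    """Hypothetical stand-in for a large numpy array."""


def make():
    # Plays the role of func2: builds an object and returns it.
    return Big()


obj = make()
ref = weakref.ref(obj)  # lets us observe the object without keeping it alive

# Roughly what the interactive prompt does with the value of an expression
# statement: print its repr and store it in builtins._
sys.__displayhook__(obj)

del obj  # drop our own name for the object
alive_after_display = ref() is not None  # builtins._ still refers to it

del builtins._  # remove the reference held by the session
alive_after_delete = ref() is not None  # freed at once under CPython
```

A function that returns nothing, like func1, hands None to the display hook, which ignores it, so no extra reference is created.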

Upvotes: 2
