Preetom Saha Arko

Reputation: 2748

Freeing large objects in Python

As far as I know, CPython garbage collection works only for objects equal to or smaller than 512 bytes. For large objects, CPython uses system calls.

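import gc
import psutil

gc.disable()  # rule out the cyclic collector

# RSS before the allocation
print(psutil.Process().memory_info().rss)

x = [[0] * 1000 for _ in range(100000)]

# RSS while the nested list is alive
print(psutil.Process().memory_info().rss)

x = 1  # rebind x, dropping the only reference to the nested list

# RSS after the nested list is no longer referenced
print(psutil.Process().memory_info().rss)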

The output of the code comes as follows:

19038208

440745984

21344256

Here [[0] * 1000 for _ in range(100000)] is a large object, much larger than 512 bytes, so it shouldn't be collected by the gc module. I also tried commenting out gc.disable(), and the output remains almost the same. This suggests the gc module is not what collects the garbage object.

Now my question is: if gc is not collecting the large object, how is the memory utilization being reduced? How is the object identified as garbage that should be collected? In what way is the garbage object collected? And can the garbage object here actually be freed before the program terminates?

Upvotes: 3

Views: 964

Answers (1)

Tim Peters

Reputation: 70592

Most garbage collection in CPython is handled by reference counting. The gc module is only needed to collect objects involved in reference cycles (which reference counting is incapable of detecting), but there are no such things in the program you posted. So gc is irrelevant.
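To illustrate the difference, here is a minimal sketch, assuming CPython. The Node class and the is_alive helper are hypothetical names introduced only for this example (a plain list cannot be weakly referenced, so a small class is used instead). With the cyclic collector disabled, an object outside any cycle is still reclaimed as soon as its last reference goes away, while a two-object cycle survives until gc.collect() is called explicitly.

```python
import gc
import weakref


class Node:
    """Hypothetical helper class; instances can be weakly referenced."""

    def __init__(self):
        self.other = None


def is_alive(ref):
    """Return True if the weakly referenced object still exists."""
    return ref() is not None


gc.disable()  # switch off the automatic cyclic collector

# Case 1: no cycle. Reference counting alone reclaims the object.
a = Node()
a_ref = weakref.ref(a)
del a
no_cycle_alive = is_alive(a_ref)

# Case 2: a reference cycle. Reference counts never reach zero on their own.
b, c = Node(), Node()
b.other, c.other = c, b
b_ref = weakref.ref(b)
del b, c
cycle_alive_before = is_alive(b_ref)

gc.collect()  # an explicit collection still works while gc is disabled
cycle_alive_after = is_alive(b_ref)

gc.enable()
print(no_cycle_alive, cycle_alive_before, cycle_alive_after)
```

The nested list in the question is like case 1: nothing in it refers back to itself, so the cyclic collector never has to get involved.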

But neither reference counting nor cyclic gc knows anything about the sizes of objects. An object is trash or it isn't. That's all they care about. It's an object's deallocation function that deals with recycling the memory. In your program, chances are the "big" memory chunks are handed back to the platform C library's malloc family by calling C's free() function. Whether or not that shows up as reduced memory use via psutil is determined not by Python, but by how your platform C libraries interact with your OS.
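One way to separate "Python released the memory" from "the OS reports lower RSS" is the standard library's tracemalloc module, which counts bytes allocated through Python's own allocators regardless of what the C library later does with them. The following is only a sketch, assuming CPython; the list is 100 times smaller than the one in the question to keep it quick, and the variable names are chosen for this example.

```python
import tracemalloc

tracemalloc.start()

# Bytes currently allocated by Python before building the list
baseline, _ = tracemalloc.get_traced_memory()

x = [[0] * 1000 for _ in range(1000)]
allocated, _ = tracemalloc.get_traced_memory()

x = 1  # drop the only reference to the nested list
released, _ = tracemalloc.get_traced_memory()

tracemalloc.stop()

print("while alive:  ", allocated - baseline, "bytes above baseline")
print("after rebind: ", released - baseline, "bytes above baseline")
```

If the second figure falls back to near zero, Python has given the memory back to its allocator. Any remaining gap in the psutil numbers is then a matter of how the C library and the OS handle freed memory.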

To answer your last question: yes, your big object becomes trash, and its deallocation function is called, immediately after x = 1 executes, because the reference count on the big object falls to 0 as soon as the new object 1 is bound to x (x held the only reference to the big object).
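The timing can be observed with weakref.finalize, which registers a callback that runs when an object is destroyed. This is a rough sketch, assuming CPython; BigList is a hypothetical list subclass added only because the built-in list type does not support weak references, and the second name y is introduced to show that the object goes away only when the last reference is dropped.

```python
import gc
import weakref


class BigList(list):
    """Hypothetical list subclass; unlike list, it supports weak references."""


gc.disable()  # the cyclic collector plays no part in this demonstration

events = []
x = BigList([0] * 1000 for _ in range(100))
weakref.finalize(x, events.append, "freed")  # callback runs on destruction

y = x  # a second reference to the same object
x = 1  # one reference remains, so nothing is freed yet
after_first_rebind = list(events)

y = None  # the last reference is gone; the object is destroyed right here
after_last_rebind = list(events)

gc.enable()
print(after_first_rebind, after_last_rebind)
```

In the question's program there is no second name like y, so the single rebinding x = 1 is what triggers deallocation.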

Upvotes: 3
