Reputation: 968
Is this a memory leak in the third party library I'm using, or is there something about Python garbage collection and memory management that I do not understand?
At the end, I would expect the memory usage to be close to what it was at the beginning (33 MB), because I don't hold any references to the objects that were created inside do_griddly_work(). However, the memory usage is far higher (1600 MB) and does not drop after exiting the function or collecting garbage.
This is printed:
Before any work: 33.6953125 MB
After griddly work: 1601.60546875 MB
0
After garbage collect: 1601.60546875 MB
by the following code:
from griddly import GymWrapperFactory, gd, GymWrapper
import gc
import os, psutil


def do_griddly_work():
    current_path = os.path.dirname(os.path.realpath(__file__))
    env = GymWrapper(current_path + '/griddly_descriptions/testbed1.yaml',
                     player_observer_type=gd.ObserverType.VECTOR,
                     global_observer_type=gd.ObserverType.SPRITE_2D,
                     level=0)
    env.reset()
    for _ in range(10000):
        c_env = env.clone()
    # Print memory usage after work
    print('After griddly work: ', psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2, 'MB')


if __name__ == '__main__':
    process = psutil.Process(os.getpid())
    # Memory usage before work = ~33MB
    print('Before any work: ', process.memory_info().rss / 1024 ** 2, 'MB')
    # Do work that clones an environment a lot
    do_griddly_work()
    # Collect garbage
    print(gc.collect())
    # Memory usage after work = ~1600 MB
    print('After garbage collect: ', process.memory_info().rss / 1024 ** 2, 'MB')
Upvotes: 1
Views: 1445
Reputation: 968
The problem was solved by the author of the Griddly library that I am using. The cloned environments weren't reachable by Python garbage collection and there was a memory leak in the underlying C++ implementation.
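For anyone trying to tell a native leak like this apart from Python objects that are still referenced, the standard-library tracemalloc module can help, because it only traces allocations made through Python's own allocators. The sketch below is an illustration, not what the Griddly author did to find or fix the bug; python_heap_growth and allocate_and_drop are hypothetical names, and allocate_and_drop merely stands in for do_griddly_work().

```python
import gc
import tracemalloc


def python_heap_growth(work):
    """Run work() and return (current, peak) bytes of traced Python allocations."""
    gc.collect()
    tracemalloc.start()
    try:
        work()
        gc.collect()
        # current: bytes still allocated now; peak: highest value seen while tracing
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return current, peak


def allocate_and_drop():
    # Hypothetical stand-in for do_griddly_work(): creates temporaries, keeps none.
    for _ in range(100):
        data = [0] * 10_000


if __name__ == "__main__":
    current, peak = python_heap_growth(allocate_and_drop)
    print(f"Python-level memory still allocated: {current / 1024 ** 2:.3f} MB")
    print(f"Python-level peak: {peak / 1024 ** 2:.3f} MB")
```

If the traced "current" figure stays small while the process RSS keeps climbing, the growth is probably happening outside Python's allocators, for example in a C++ extension, where gc.collect() cannot help.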
Upvotes: 1
Reputation: 308081
Most languages, including Python, are under no obligation to release memory back to the OS once an object is destroyed. In fact, because OS allocations are generally made in blocks much larger than a single object, a block will usually contain multiple objects, and if any of them is still live, the block cannot be returned to the OS.
memory_info().rss reports the memory in use from the OS's point of view, not the Python runtime's.
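To see the two points of view side by side, one option is to compare sys.getallocatedblocks(), which counts memory blocks currently allocated by the interpreter, with the RSS figure from psutil. This is a rough sketch under assumptions: measure_blocks and rss_mb are hypothetical helper names, and psutil is treated as optional so the script still runs without it.

```python
import gc
import sys


def rss_mb():
    """Resident set size in MB as the OS sees it, or None if psutil is missing."""
    try:
        import psutil  # third-party; the same package the question uses
    except ImportError:
        return None
    return psutil.Process().memory_info().rss / 1024 ** 2


def measure_blocks(n=200_000):
    """Return interpreter block counts before, while holding, and after dropping n objects."""
    gc.collect()
    before = sys.getallocatedblocks()
    objs = [object() for _ in range(n)]  # n small, distinct Python objects
    during = sys.getallocatedblocks()
    del objs  # drop the only reference to the list and its contents
    gc.collect()
    after = sys.getallocatedblocks()
    return before, during, after


if __name__ == "__main__":
    print("RSS before (MB):", rss_mb())
    before, during, after = measure_blocks()
    print("blocks before/during/after:", before, during, after)
    print("RSS after (MB):", rss_mb())
```

The block count should fall back toward its starting value once the objects are dropped, which is the runtime's view. Whether RSS falls as well depends on the allocator and the platform, so it may or may not match.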
Upvotes: 1