Or Y

Reputation: 2118

Weird memory consumption of a long-running project as stated by guppy

I have a long-running Python project whose memory usage grows slowly but steadily. The main script spawns new processes (using multiprocessing) when it receives a lot of work to handle, and both the main process and the child processes run indefinitely (or until their target job is removed).

To try to understand the source of the memory leak, I used the guppy module to profile memory, and made each process write its memory profile to a file every two minutes.
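A periodic dump of this kind could look roughly like the sketch below. It is only an illustration, not the project's actual code: the helper name `start_profile_dumper`, its parameters, and the file layout are hypothetical, and the profiler is passed in as a plain callable so the helper itself does not depend on guppy.

```python
import threading
import time


def start_profile_dumper(path, snapshot, interval_s=120.0):
    """Append snapshot() to `path` every `interval_s` seconds until stopped.

    `snapshot` is any zero-argument callable returning the report text.
    Returns (stop_event, thread); call stop_event.set() to end the loop.
    """
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout and True once stop.set() is called.
        while not stop.wait(interval_s):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            with open(path, "a") as f:
                f.write(f"--- {stamp} ---\n{snapshot()}\n")

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop, thread
```

Assuming the guppy3 interface, the callable could be `lambda: str(hp.heap())` after `from guppy import hpy` and `hp = hpy()`. Each process would start its own dumper with its own output path, for example one that includes the process ID.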

When checking the memory profiles, I see that only one type is growing in memory: list. The values for the rest of the types are basically static (sometimes a bit higher, sometimes a bit lower).

For example, here is a report from one of the processes:

 Index  Count   %     Size   % Cumulative  % Referrers by Kind (class / dict of class)
     0   3707  43   491717  58    491717  58 list
     1   4200  48   239659  28    731376  87 dict of CustomClass
     2    600   7    96000  11    827376  98 CustomClass
     3      4   0     4646   1    832022  98 _io.TextIOWrapper
     4     40   0     3664   0    835686  99 dict (no owner)
     5     46   1     3358   0    839044  99 urllib.parse.SplitResult
     6     90   1     2160   0    841204 100 dict of OtherCustomClass
     7     18   0     2080   0    843284 100 tuple
     8      1   0      424   0    843708 100 <Nothing>
     9      4   0      384   0    844092 100 types.FrameType

If any other type had increased in count or size, I could probably have told which list is to blame for the memory leak. But because only list is growing, I am not sure how to approach this and find which list is the origin of the leak.

Moreover, all of my lists that belong to a class instance are either limited in size or cleared at every interval (and I made sure that clear is called). The other lists I create are local to functions, so I expect them to be freed as soon as they go out of scope.

Am I missing something? Thank you very much

Upvotes: 0

Views: 258

Answers (1)

Tim Boddy

Reputation: 1069

Consider using https://github.com/vmware/chap

In your case you might want to do the following:

Start your process (not instrumented in any way).

Grab live cores for that process, for example by using gcore, sufficiently far apart that they show growth.

For each core, open the core in chap and do the following:

redirect on
describe used

As a result, you will have two files (one for each core) that you can compare to see which allocations are new or have disappeared. Your lists should be reflected there. Find an address for one of the lists and use "explain" to figure out how it is being used.
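For the comparison step, a plain line-based diff of the two files may be enough to start with. The sketch below is an assumption-laden illustration rather than part of chap: the function name `compare_reports` is hypothetical, and it only assumes that each allocation appears on a line that includes its address, so that a line present in one file but not the other indicates an allocation that appeared or went away. It makes no other assumption about the report format.

```python
from pathlib import Path


def compare_reports(old_path, new_path):
    """Return (added, removed): non-blank lines unique to each text report."""

    def load(path):
        # Ignore blank lines and surrounding whitespace.
        lines = Path(path).read_text().splitlines()
        return {line.strip() for line in lines if line.strip()}

    old, new = load(old_path), load(new_path)
    # Lines only in the newer report vs. lines only in the older one.
    return sorted(new - old), sorted(old - new)
```

Calling `compare_reports(earlier_file, later_file)` with the two redirected outputs returns the lines that are new in the later core and the lines that are gone from it. Addresses taken from the first list are the ones worth passing to "explain".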

Upvotes: 0
