titto.sebastian

Reputation: 531

Internal list containing dictionary keys does not deallocate memory even after removing keys from the dictionary

I am using dictionaries in my Python script, written in Python 2.7.

When I ran my script under pympler to look for memory leaks, I found that the total size of list objects was growing enormously.

These lists contain the keys of my dictionaries. Are these lists part of the built-in implementation of Python's dictionary?

Output of pympler is as following:

                          types |   # objects |   total size
    =========================== | =========== | ============
                           list |       99221 |    106.53 MB
                            str |      105530 |      6.06 MB
                           dict |         602 |    940.48 KB
                           code |        1918 |    239.75 KB
                            int |       10043 |    235.38 KB
             wrapper_descriptor |        1120 |     87.50 KB
                           type |          87 |     76.80 KB
     builtin_function_or_method |         719 |     50.55 KB
              method_descriptor |         601 |     42.26 KB
                            set |         132 |     33.41 KB
                        weakref |         372 |     31.97 KB
                          tuple |         364 |     26.24 KB
            <class 'abc.ABCMeta |          20 |     17.66 KB
              member_descriptor |         233 |     16.38 KB
            function (__init__) |         114 |     13.36 KB

The size of the list objects keeps increasing, and the RES memory of the script increases along with it. These lists contain the keys of the dictionary, which I suspect is part of the dictionary's internal implementation. Are the keys stored as a list in memory? How can I fix this memory leak?

Here is the code for the above output:

from pympler import tracker
from pympler import summary
from pympler import muppy
import types

while 1:
    d = {}
    for i in range(1000, 1100):
        d[i] = 1
    for i in range(1000, 1050):
        del d[i]
    all_objects = muppy.get_objects()
    sum1 = summary.summarize(all_objects)
    summary.print_(sum1)
    type = muppy.filter(all_objects, Type=types.ListType)
    print 'length :%s' % len(type)

Upvotes: 0

Views: 197

Answers (1)

abarnert

Reputation: 365975

Your problem is an artifact of testing things poorly.

In your loop, you do this:

all_objects = muppy.get_objects()

You never delete all_objects, so the previous list of all objects is still a live object at the time you call get_objects. Of course you drop that reference when you assign the new list to all_objects, but that's too late; the new list of all objects now has a reference to the old list of all objects.
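This is easy to reproduce without pympler at all. The sketch below uses the stdlib's `gc.get_objects()` as a rough stand-in for `muppy.get_objects()` (the exact counts depend on the interpreter, but the growth pattern is the same):

```python
import gc

# Take repeated "all objects" snapshots while keeping the previous one bound,
# exactly like the loop in the question: each new snapshot ends up holding a
# reference to the old one, so the object count creeps upward every iteration.
snapshot = None
growing = []
for _ in range(3):
    snapshot = gc.get_objects()   # old snapshot is still alive during this call
    growing.append(len(snapshot))

# Now drop the reference before each new snapshot: the counts stay flat.
del snapshot
flat = []
for _ in range(3):
    snapshot = gc.get_objects()
    flat.append(len(snapshot))
    del snapshot                  # nothing left over for the next call to see

print(growing)
print(flat)
```

The first list of counts grows by one leftover snapshot per iteration; the second stays flat, because the reference is dropped before the next snapshot is taken.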

This is actually a common problem with people using all kinds of in-process memory debugging tools—it's easy to end up keeping around artifacts of the tool and end up debugging those artifacts.

One nice way to make sure you don't accidentally do this kind of thing is to factor out your muppy code into a function:

def summarize():
    all_objects = muppy.get_objects()
    sum1 = summary.summarize(all_objects)
    summary.print_(sum1) 
    type = muppy.filter(all_objects, Type=types.ListType)
    print 'length :%s' % len(type)

Now, in the loop, you can just call summarize(), and be sure that you can't have any local variables lying around that you didn't want.
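The effect of factoring the snapshot into a function can also be sketched without pympler; here `gc.get_objects()` again stands in for `muppy.get_objects()` (an assumption for illustration, not pympler's API):

```python
import gc

def count_tracked():
    # all_objects is a local, so it dies when this function returns;
    # the next call can never see a leftover snapshot from the last one.
    all_objects = gc.get_objects()
    return len(all_objects)

counts = [count_tracked() for _ in range(3)]
print(counts)  # stays flat instead of growing by one snapshot per call
```

Because the snapshot never escapes the function, there is no way to accidentally keep it alive between iterations.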

Upvotes: 4
