林果皞

Reputation: 7803

python - gc unreachable when reload()

I have this code, saved as so.py:

import gc
gc.set_debug(gc.DEBUG_STATS|gc.DEBUG_LEAK)

class GUI():
    #########################################
    def set_func(self):
        self.functions = {}
        self.functions[100] = self.userInput
    #########################################
    def userInput(self):
        a = 1
g = GUI()
g.set_func()
print gc.collect()
print gc.garbage

And this is the output: [screenshot of the interpreter session, not preserved]

I have two questions:

  1. Why does gc.collect() not report any unreachable objects on the first import? It reports them only after reload().

  2. Is there a quick way to fix this function-mapping circular reference, i.e. self.functions[100] = self.userInput? My old project has a lot of these circular references, and I'm looking for a quick, ideally one-line, change to that code. Currently what i do is "del g.functions" for all this functions at the end.

Upvotes: 2

Views: 678

Answers (2)

Bakuriu

Reputation: 101999

  1. The first time you import the module nothing is being collected because you have a reference to the so module and all other objects are referenced by it, so they are all alive and the garbage collector has nothing to collect.

    When you reload(so), the module is re-executed. Its names are rebound to newly created objects, so the old values are no longer referenced by the module.

    You do have a reference cycle in:

     self.functions[100] = self.userInput
    

    since self.userInput is a bound method, it holds a reference to self. So self references the functions dictionary, which references the userInput bound method, which references self again. Once nothing else refers to the old instance, the gc finds this cycle and collects those objects.

  2. It depends on what you are trying to do. From your code it is not clear how you are using the self.functions dictionary, and different options may be viable depending on that.

    The simplest way to break the cycle is to simply not create the self.functions attribute, but pass the dictionary around explicitly.

    If self.functions only references bound methods, you could store the name of each method instead of the method itself:

    self.functions[100] = self.userInput.__name__
    

    and then call the method with:

    getattr(self, self.functions[100])()
    

    or you can do:

    from operator import methodcaller
    call_method = methodcaller(self.functions[100])
    call_method(self)   # calls self.userInput()
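
    As a rough sketch of the name-based mapping above (the dispatch helper and the weakref probe are my own additions for illustration, not part of your code), you can check that no cycle is left behind. The check relies on CPython's reference counting, so treat it as a demonstration under that assumption:

```python
import gc
import weakref

class GUI(object):
    def set_func(self):
        # Store the method *name*, not the bound method, so the
        # dictionary holds no reference back to self.
        self.functions = {100: self.userInput.__name__}

    def userInput(self):
        return 1

    def dispatch(self, code):
        # Hypothetical helper: look the method up by name at call time.
        return getattr(self, self.functions[code])()

gc.disable()               # leave only reference counting in play
g = GUI()
g.set_func()
result = g.dispatch(100)   # calls g.userInput()

probe = weakref.ref(g)     # a weak reference does not keep g alive
del g
freed_without_gc = probe() is None   # no cycle, so g is gone already
gc.enable()
```

    The cost is one getattr lookup per call, and a renamed method only fails when it is dispatched, not when the mapping is built.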
    

I don't really understand what you mean by "Currently what i do is del g.functions for all this functions at the end." Which functions are you talking about?

Also, is this really a problem? Are you experiencing a real memory leak?

Note that the garbage collector reports the objects as unreachable, not as uncollectable. This means that the objects are freed even though they are part of a reference cycle, so no memory leak should happen.
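
A small check of that claim, reusing the GUI class from the question; the weakref probe and the variable names are mine, added only to observe when the instance goes away. It is a sketch of the behaviour on CPython, not a benchmark:

```python
import gc
import weakref

class GUI(object):
    def set_func(self):
        # Same mapping as in the question: instance -> dict ->
        # bound method -> instance, i.e. a reference cycle.
        self.functions = {100: self.userInput}

    def userInput(self):
        pass

gc.disable()      # no automatic collections during the experiment
gc.collect()      # start from a clean state

g = GUI()
g.set_func()
probe = weakref.ref(g)

del g
alive_before = probe() is not None   # the cycle keeps the instance alive

found = gc.collect()                 # number of unreachable objects found
alive_after = probe() is not None    # the collector has freed the cycle
gc.enable()
```

The instance survives the del because of the cycle, but the next collection frees it without any manual del g.functions.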

In fact adding del g.functions is useless because the objects are going to be freed anyway, so the one line fix is to simply remove all those del statements, since they don't do anything at all.

They end up in gc.garbage because gc.DEBUG_LEAK implies the flag gc.DEBUG_SAVEALL, which makes the collector put all unreachable objects into gc.garbage, not just the uncollectable ones.
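
To see the effect of the flag in isolation, here is a minimal sketch that creates the same kind of cycle twice, once with the default debug flags and once with gc.DEBUG_SAVEALL. The helper make_unreachable_cycle and the result names are made up for this example:

```python
import gc

class GUI(object):
    def set_func(self):
        self.functions = {100: self.userInput}

    def userInput(self):
        pass

def make_unreachable_cycle():
    # g goes out of scope on return, leaving only the cycle behind.
    g = GUI()
    g.set_func()

gc.disable()
gc.collect()
del gc.garbage[:]            # start with an empty garbage list

make_unreachable_cycle()
gc.collect()
saved_by_default = len(gc.garbage)      # collectable cycles are not saved

gc.set_debug(gc.DEBUG_SAVEALL)
make_unreachable_cycle()
gc.collect()
saved_with_saveall = len(gc.garbage)    # now every unreachable object is saved

# Restore the defaults so the saved objects can really be freed.
gc.set_debug(0)
del gc.garbage[:]
gc.enable()

# DEBUG_LEAK is a combination of flags that includes DEBUG_SAVEALL.
leak_implies_saveall = bool(gc.DEBUG_LEAK & gc.DEBUG_SAVEALL)
```

So a non-empty gc.garbage under DEBUG_LEAK is an artifact of the debugging flag, not evidence of a leak.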

Upvotes: 5

aecolley

Reputation: 2011

  1. The nature of reload is that the module is re-executed. The new definitions supersede the old ones, so the old values become unreachable. By contrast, on the first import, there are no superseded definitions, so naturally there is nothing to become unreachable.
  2. One way is to pass the functions object as a parameter to set_func and not assign it as an instance attribute. This breaks the cycle while still allowing you to pass the functions object to wherever it's needed.
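
A minimal sketch of point 2, assuming the caller owns the dictionary; the module-level functions variable and the weakref probe are illustrative names only, and the check depends on CPython's reference counting:

```python
import gc
import weakref

class GUI(object):
    def set_func(self, functions):
        # Fill in a dictionary owned by the caller instead of
        # storing it on self, so self never refers to it.
        functions[100] = self.userInput

    def userInput(self):
        return 1

gc.disable()                 # leave only reference counting in play
functions = {}               # owned by the caller, not by the instance
g = GUI()
g.set_func(functions)
result = functions[100]()    # calls g.userInput()

probe = weakref.ref(g)
del g
# The bound method in the dictionary still references the instance...
still_alive = probe() is not None
# ...but nothing points back from the instance, so there is no cycle:
functions.clear()
freed_without_gc = probe() is None
gc.enable()
```

Note that the instance now lives exactly as long as the dictionary holds its bound methods, which may or may not be what you want.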

Upvotes: 1
