Reputation: 900
Consider this code snippet:
import gc
from weakref import ref
def leak_class(create_ref):
class Foo(object):
# make cycle non-garbage collectable
def __del__(self):
pass
if create_ref:
# create a strong reference cycle
Foo.bar = Foo()
return ref(Foo)
# without reference cycle
r = leak_class(False)
gc.collect()
print r() # prints None
# with reference cycle
r = leak_class(True)
gc.collect()
print r() # prints <class '__main__.Foo'>
It creates a reference cycle that cannot be collected, because the referenced instance has a __del__
method. The cycle is created here:
# create a strong reference cycle
Foo.bar = Foo()
This is just a proof of concept, the reference could be added by some external code, a descriptor or anything. If that's not clear to you, remember that each objects mantains a reference to its class:
+-------------+ +--------------------+
| | Foo.bar | |
| Foo (class) +------------>| foo (Foo instance) |
| | | |
+-------------+ +----------+---------+
^ |
| foo.__class__ |
+--------------------------------+
If I could guarantee that Foo.bar
is only accessed from Foo
, the cycle wouldn't be necessary, as theoretically the instance could hold only a weak reference to its class.
Can you think of a practical way to make this work without a leak?
As some are asking why would external code modify a class but can't control its lifecycle, consider this example, similar to the real-life example I was working to:
class Descriptor(object):
def __get__(self, obj, kls=None):
if obj is None:
try:
obj = kls._my_instance
except AttributeError:
obj = kls()
kls._my_instance = obj
return obj.something()
# usage example #
class Example(object):
foo = Descriptor()
def something(self):
return 100
print Example.foo
In this code only Descriptor
(a non-data descriptor) is part of the API I'm implementing. Example
class is an example of how the descriptor would be used.
Why does the descriptor store a reference to an instance inside the class itself? Basically for caching purposes. Descriptor
required this contract with the implementor: it would be used in any class assuming that
something
here).It doesn't assume anything about:
Moreover the API was designed to avoid any extra load on the class implementor. I could have moved the responsibility for caching the object to the implementor, but I wanted a standard behavior.
There actually is a simple solution to this problem: make the default behavior to cache the instance (like it does in this code) but allow the implementor to override it if they have to implement __del__
.
Of course this wouldn't be as simple if we assumed that the class state had to be preserved between calls.
As a starting point, I was coding a "weak object", an implementation of object
that only kept a weak reference to its class:
from weakref import proxy
def make_proxy(strong_kls):
kls = proxy(strong_kls)
class WeakObject(object):
def __getattribute__(self, name):
try:
attr = kls.__dict__[name]
except KeyError:
raise AttributeError(name)
try:
return attr.__get__(self, kls)
except AttributeError:
return attr
def __setattr__(self, name, value):
# TODO: implement...
pass
return WeakObject
Foo.bar = make_proxy(Foo)()
It appears to work for a limited number of use cases, but I'd have to reimplement the whole set of object
methods, and I don't know how to deal with classes that override __new__
.
Upvotes: 1
Views: 295
Reputation: 251363
For your example, why don't you store _my_instance
in a dict on the descriptor class, rather than on the class holding the descriptor? You could use a weakref or WeakValueDictionary in that dict, so that when the object disappears the dict will just lose its reference and the descriptor will create a new one on the next access.
Edit: I think you have a misunderstanding about the possibility of collecting the class while the instance lives on. Methods in Python are stored on the class, not the instance (barring peculiar tricks). If you have an object obj
of class Class
, and you allowed Class
to be garbage collected while obj
still exists, then calling a method obj.meth()
on the object would fail, because the method would have disappeared along with the class. That is why your only option is to weaken your class->obj reference; even if you could make objects weakly reference their class, all it would do is break the class if the weakness ever "took effect" (i.e., if the class were collected while an instance still existed).
Upvotes: 2
Reputation: 64308
The problem you're facing is just a special case of the general ref-cycle-with-__del__
problem.
I don't see anything unusual in the way the cycles are created in your case, which is to say, you should resort to the standard ways of avoiding the general problem.
I think implementing and using a weak object
would be hard to get right, and you would still need to remember to use it in all places where you define __del__
. It doesn't sound like the best approach.
Instead, you should try the following:
__del__
in your class (recommended)__del__
, avoid reference cycles (in general, it might be hard/impossible to make sure no cycles are created anywhere in your code. In your case, seems like you want the cycles to exist)del
(if there are appropriate points to do that in your code)gc.garbage
list, and explicitly break reference cycles (using del
)Upvotes: 1