Guohua Cheng
Guohua Cheng

Reputation: 963

Circular reference in python

I am not sure how python deal with circular reference (reference cycle). I check some answers and found this:

Python's standard reference counting mechanism cannot free cycles, so the structure in your example would leak.
The supplemental garbage collection facility, however, is enabled by default and should be able to free that structure, if none of its components are reachable from the outside anymore and they do not have __del__() methods.

I guess it means that if none of the instances in reference cycle are reachable outside, both of them will be cleaned up. Is this true?
On the other hand, there is a package weakref which is often used to deal with map dictionary. The purpose of its existence, I suppose, is to avoid reference cycle.
In summary, can python deal with reference cycle automatically? If they can, why do we have to use weakref?

Upvotes: 4

Views: 7932

Answers (3)

Yilmaz
Yilmaz

Reputation: 49293

Variables are memory references.

 my_var=10 

this is stored in the one of the memory slots. my_var actually references the address of the memory slot where the 10 is stored. if you type:

id(my_var) 

you will get the address of the slot in base-10. hex(id(my_var) ) will give the hex representation of the address.

Whenever we use my_var python memory manager goes to the memory and retrieves the value of 10. Python memory manager also keeps track of the number of references for this memory slot. if there is no reference to this memory address, python memory manager destroys this object, and uses this memory slot for new objects.

imagine we have two classes:

class A:
    def __init__(self):
        self.b = B(self)
        print('A: self: {0}, b:{1}'.format(hex(id(self)), hex(id(self.b))))

class B:
    def __init__(self, a):
        self.a = a
        print('B: self: {0}, a: {1}'.format(hex(id(self)), hex(id(self.a))))

when you define an instance of class A:

   my_var = A()

you will get this printed: (in your system, you will have different addresses)

B: self: 0x1fc1eae44e0, a: 0x1fc1eae4908
A: self: 0x1fc1eae4908, b:0x1fc1eae44e0

pay attention to the references. they are circular referenced.

NOTE: in order to see those references you have to disable the garbage collector otherwise it will delete them automatically.

  gc.disable()

Currently reference count of my_var which is (0x1fc1eae4908) is 2. my_var and classB are referencing to this address. if we change the my_var

   my_var= None

now my_var is not pointing the same memory address. Now reference count of (0x1fc1eae4908) is 1 therefore this memory slot is not cleaned up.

now we will have Memory Leak which is when the memory is no longer needed is not cleaned up.

Garbage Collector will automatically identify the memory leak in circular references and clean them up. But if even one of the objects in the circular reference has a destructor (del()), Garbage Collector does not know the destruction order of the objects. So the object is marked as uncollectable and objects in the circular reference are not cleaned up which leads to memory leak.

weakref is used for caching purposes. I think python has very good documentation.

here is the reference for the weakref:

weakref documentation

Upvotes: 2

Artyer
Artyer

Reputation: 40811

You do not have to worry about reference cycles if the objects in the cycle don't have a custom __del__ method, as Python can (and will) destroy the objects in any order.

If your custom methods do have a __del__ method, Python doesn't know if one objects deletion effects another objects deletion at all. Say when an object is deleted, it sets some global variable. So, the objects stick around. As a quick test, you can make a __del__ method that prints something:

class Deletor(str):
    def __del__(self):
        print(self, 'has been deleted')

a = Deletor('a')  # refcount: 1
del a             # refcount: 0

Outputs:

a has been deleted

But if you had code like this:

a = Deletor('a')  # refcount: 1
a.circular = a    # refcount: 2
del a             # refcount: 1

It outputs nothing, as Python can't safely delete a. It becomes an "uncollectable object", and can be found in gc.garbage

There are two solutions to this. The weakref (which doesn't increase the reference count):

# refcount:             a  b
a = Deletor('a')      # 1  0
b = Deletor('b')      # 1  1
b.a = a               # 2  1
a.b = weakref.ref(b)  # 2  1
del a                 # 1  1
del b                 # 1  0
# del b kills b.a     # 0  0

Outputs:

b has been deleted
a has been deleted

(Note how b is deleted first before it can delete a)

And you can manually delete the cycles (If you can keep track of them):

# refcount          a  b
a = Deletor('a')  # 1  0
b = Deletor('b')  # 1  1
b.a = a           # 2  1
a.b = b           # 2  2
del b             # 2  1
print('del b')
del a.b           # 2  0
# b is deleted, now dead
# b.a now dead    # 1  0
print('del a.b')
del a             # 0  0
print('del a')

Outputs:

del b
b has been deleted
del a.b
a has been deleted
del a

Notice how b is deleted after a.b is deleted.


† Starting with Python 3.4, things changed due to PEP 442. __del__ may be called even on objects in a reference cycle and the semantics are slightly different, so it is slightly harder to become an uncollectable object.

A weakref is still helpful, as it is less intense on the garbage collector, and memory can be reclaimed earlier.

Upvotes: 6

VPfB
VPfB

Reputation: 17247

(Trying to answer why do we then have the weak references subquestion.)

Weakrefs do not only break circular references, but also prevent unwanted non-circular references.

My favourite example is counting simultaneous network connections (kind of load measuring) using a WeakSet. In this example every new connection has to be added to the WeakSet, but that's the only task the networking code needs to do. The connection can be closed by the server, by the client, or in an error handler, but none of these routines has the responsibility to remove the connection from the set, and that is because the additional references are weak.

Upvotes: 2

Related Questions