jmd_dk

Reputation: 13090

Usage of Cython directive no_gc

In Cython 0.25 the no_gc directive was added. The documentation for this new directive (as well as for a related no_gc_clear directive) can be found here, but the only thing I really understand about it is that it can speed up your code by disabling certain aspects of garbage collection.

I am interested because I have some high performance Cython code which uses extension types, and I understand that no_gc can speed things up further. In my code, instances of extension types are always left alive until the very end when the program closes, which makes me think that disabling garbage collection for these might be OK.

I guess what I really need is an example where the usage of no_gc goes bad and leads to memory leaks, together with an explanation of exactly why that happens.

Upvotes: 2

Views: 1085

Answers (1)

DavidW

Reputation: 30890

It's to do with circular references. When an instance a holds a reference to a Python object that in turn references a, then a can never be freed through reference counting alone, so Python's garbage collector tries to detect the cycle and break it.

A very trivial example of a class that could cause issues is:

# Cython code:

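cdef class A:
    cdef object param  # holds a Python object, so A is GC-tracked by default

    def __init__(self):
        self.param = self  # reference cycle: the instance refers to itself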

(and some Python code to run it)

import cython_module
while True:
    cython_module.A()

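This is fine as is (the cycles are detected and the instances get deallocated every so often), but if you add no_gc (for example by decorating the class with @cython.no_gc) then the cycles are never detected, nothing is ever freed, and you will run out of memory.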
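The effect can be imitated in pure Python, as a rough analogue rather than a demonstration of no_gc itself. The sketch below assumes CPython; Node and cycle_survives_without_gc are hypothetical names that are not part of the code above. It switches the cyclic collector off temporarily and shows that a self-referencing object outlives its last external reference until the collector is run by hand:

```python
import gc
import weakref


class Node:
    """Pure-Python stand-in for the extension type A (hypothetical name)."""

    def __init__(self):
        self.param = self  # reference cycle: the object points at itself


def cycle_survives_without_gc():
    """Return (alive_before_collect, alive_after_collect) for a cyclic object."""
    was_enabled = gc.isenabled()
    gc.disable()  # no automatic cycle detection while we look
    try:
        node = Node()
        probe = weakref.ref(node)  # lets us check liveness without keeping it alive
        del node  # the refcount stays at 1 because of self.param
        alive_before = probe() is not None
        gc.collect()  # the cycle detector finds the object and frees it
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()  # restore the previous collector state
```

With no_gc the instances are never handed to the cycle detector at all, so the second step (the collection that frees the object) never happens for them.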

A more realistic example might be a parent/child pair that store a reference to each other.


It's worth adding that the performance gains are likely to be small. The garbage collector only runs occasionally, in situations where a lot of objects have been allocated and few have been freed (https://docs.python.org/3/library/gc.html - see set_threshold). Hopefully that doesn't describe your high performance code.
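To see how often the collector is likely to run in a given process, the thresholds and the current counters can be inspected from Python. This is only a small sketch; gc_settings is a hypothetical helper name, and the actual numbers depend on the Python version and on what the program has done so far:

```python
import gc


def gc_settings():
    """Collect the collector's current configuration and counters."""
    return {
        "enabled": gc.isenabled(),         # is automatic collection switched on?
        "thresholds": gc.get_threshold(),  # (threshold0, threshold1, threshold2)
        "counts": gc.get_count(),          # current (count0, count1, count2)
    }


if __name__ == "__main__":
    for name, value in gc_settings().items():
        print(name, value)
```

If the counts stay well below the thresholds while your hot loop runs, the collector is not being triggered there and no_gc is unlikely to buy much.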

There's probably also a small performance cost on allocation and deallocation of your objects with GC, to add them to and remove them from the list of tracked objects (but again, hopefully you aren't allocating/deallocating huge numbers).


Finally, if your class never stores any references to Python objects then it's effectively no_gc anyway. Setting the option will do no harm but also do no good.
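Plain Python shows the same distinction through gc.is_tracked. The sketch below assumes CPython and uses a hypothetical helper name, tracking_report; objects that cannot hold references to other objects are not tracked by the collector, while containers are:

```python
import gc


def tracking_report():
    """Report whether a few built-in objects are tracked by the cycle collector."""
    return {
        "int": gc.is_tracked(0),     # atomic, cannot be part of a cycle
        "str": gc.is_tracked("a"),   # atomic, cannot be part of a cycle
        "list": gc.is_tracked([]),   # container, may hold references
    }


if __name__ == "__main__":
    print(tracking_report())
```

An extension type with only C-level fields (ints, doubles, pointers) is in the same position as the int and str here, which is why no_gc changes nothing for it.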

Upvotes: 2
