CircArgs
CircArgs

Reputation: 650

Cython "Memory Efficient Doubly Linked List"

As an exercise, I am putting together a doubly-linked list using XOR on pointers to facilitate only having to store a single value per node. I'm trying to use Cython which is new to me and I drew up the following pattern quickly without knowing if the features of cython would work as expected

cdef class Node:
    def __init__(self, object val, void* prev_xor_next=NULL):
        self.prev_xor_next=prev_xor_next
        self.val=val

    def __repr__(self):
        return str(self.val)

cdef class CurrentNode(Node):
    def __init__(self, Node node, void* prev_ptr=NULL):
        self.node=node
        self.prev_ptr=prev_ptr

    cpdef CurrentNode forward(self):
        if self.node.prev_xor_next:
            return CurrentNode((self.node.prev_xor_next^self.prev_ptr)[0], &self.node)

    cpdef CurrentNode backward(self):
        if self.prev_ptr:
            return CurrentNode(self.prev_ptr[0], (&self.node)^(self.prev_ptr[0]).prev_xor_next)

cdef class XORList:
    cdef Node first, last, current

    cpdef append(self, object val):
        if not first:
            self.first=CurrentNode(Node(val))
            self.last=self.first
            self.current=self.first
        else:
            if last==first:
                self.last=CurrentNode(Node(val))
                self.first.node.prev_xor_next=(&self.last.node)^self.first.prev_ptr
                self.first=self.first.node
                self.last.prev_ptr=&self.first
            else:
                temp=CurrentNode(Node(val))
                self.last.node.prev_xor_next=(&self.temp.node)^self.last.prev_ptr
                self.temp.prev_ptr=&self.last
                self.last=temp

    cpdef reverse(self):
        temp=self.last
        self.last=self.first
        self.first=temp

    def __iter__(self):
        return self

    def __next__(self):
        if not self.current or not self.current.forward():
            self.current=CurrentNode(self.first)
            raise StopIteration()
        ret = self.current
        self.current=self.current.forward()
        return ret

    def __repr__(self):
        cdef str ret =''
        for i in self:
            ret+='<=>'+str(i)
        return ret

The problems I've faced include getting pointers to extension types and passing pointers to constructors. The second of which I've found solved by using a static factory method but I've left that out here to hopefully keep the pattern I'm trying clearer.

I've tried replacing Node with a struct yet this leaves me unable to take a python object as the Node's value.

I've looked into PyObject and PyCapsule, but I've had no luck with these either which could simply be due to a lack of understanding of their usage although their base purposes do seem clear.

Is there an approach that allows this pattern using Cython?

Please forgive any logical errors in the code. I haven't tested it and it's just an exercise.

Upvotes: 0

Views: 251

Answers (1)

DavidW
DavidW

Reputation: 30929

You can cast extension types to-and-from pointers using the angle-bracket cast syntax. It's often worth using the type PyObject* which you can cimport from cpython.object however you can also go directly to void*:

from cpython.object cimport PyObject

cdef Node n = Node()
cdef PyObject* nptr1 = <PyObject*>n
cdef void* nptr2 = <void*>n
cdef void* nptr3 = <void*>nptr1

To return back to the cdef type:

cdef Node new_node = <Node>nptr1

The one thing Cython won't let you is:

# ERROR: Storing unsafe C derivative of temporary Python reference
cdef PyObject* bad = <PyObject*>Node()

It realises that the new Node will cease to exist almost as soon as it is made, so the pointer is instantly invalid. In contrast nptr1, nptr2, and nptr3 are valid for at least as long as n isn't reassigned.


Note that you must handle the reference counting for these yourself. You can access the relevant function with cimport again: from cpython.ref cimport Py_XINCREF, Py_XDECREF. The functions take PyObject*, hence why it's useful to use it. If you intend to store one of these pointers then you need to increase the reference count, and you should decrease the reference count when you're done with it.

I don't know exactly what reference counting should be done for your xor-ed list, but you will need to do it!

There's potentially a more subtle reference counting problem here too: if you have circular references (for example if Node contains a link to XOrList) then it'll never get freed. Cython attempts to generate functions that handle circular references for cdef classes however it doesn't (and can't) understand what you're doing with pointers so they will be excluded; it isn't easy to override this behaviour.

Upvotes: 3

Related Questions