Stephen

Reputation: 2823

Specifying string types in Cython code

I'm experimenting with Cython and came across some unexpected behavior:

In [1]: %load_ext cython

In [2]: %%cython
   ...: cdef class foo(object):
   ...:     cdef public char* val
   ...:     def __init__(self, char* val):
   ...:         self.val = val
   ...:

In [3]: f = foo('aaa')

In [4]: f.val
Out[4]: '\x01'

What's going on with f.val? Repeated inspection produces seemingly random output, so it looks like f.val is pointing to invalid memory.

The answer to this question suggests that you should use str instead. Indeed, this version works fine:

In [21]: %%cython
    ...: cdef class foo(object):
    ...:     cdef public str val
    ...:     def __init__(self, str val):
    ...:         self.val = val

So, what is going on in the first version? It seems like the char* is getting freed at some point after class construction but I'm not really clear on why.

Upvotes: 3

Views: 399

Answers (1)

user2357112

Reputation: 280251

When you convert a Python bytestring to a char * in Cython, Cython gives you a pointer to the contents of the string object. This raw pointer does not affect the string's Python refcount (it'd be infeasible to track which pointers refer to which strings).

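When the string's refcount hits zero and the string is reclaimed, your pointer becomes invalid. In your example, foo only stores the raw pointer, so the only references to 'aaa' belong to the calling code. Once those are gone, f.val points at freed memory, which is why you see different garbage on each inspection.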

You shouldn't convert Python bytestrings to char *s unless you actually need a char *. If you do, make sure to also keep a normal Python reference to the string for as long as you need the char *, to extend the string's lifetime.
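
As a rough sketch of that last point, assuming you really do need the char * (say, to hand it to a C function), the class can hold the bytes object next to the pointer. The names Foo, _owner and _ptr are made up for illustration, and the example assumes the caller passes a bytes object (on Python 3 a str would not convert to char *):

```cython
cdef class Foo:
    cdef bytes _owner   # ordinary Python reference; keeps the buffer alive
    cdef char* _ptr     # raw pointer into _owner's buffer; owns nothing

    def __init__(self, bytes val not None):
        # Store the Python reference so the string cannot be reclaimed
        # while this object still holds a pointer into it.
        self._owner = val
        # Borrow the pointer from the same object we just stored.
        self._ptr = val

    @property
    def val(self):
        # Converting char* back to a Python object copies the
        # NUL-terminated data into a new bytes object.
        return self._ptr
```

With this layout the pointer stays valid for as long as the Foo instance exists, because _owner holds a reference to the string for the instance's whole lifetime. If you never pass the pointer to C code, the simpler fix is the one from the question: store a str or bytes attribute and skip the char * entirely.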

Upvotes: 3
