Reputation: 2823
I'm doing some experimentation with cython and I came across some unexpected behavior:
In [1]: %load_ext cython

In [2]: %%cython
   ...: cdef class foo(object):
   ...:     cdef public char* val
   ...:     def __init__(self, char* val):
   ...:         self.val = val
   ...:

In [3]: f = foo('aaa')

In [4]: f.val
Out[4]: '\x01'
What's going on with f.val? Repeated inspection produces seemingly random output, so it looks like f.val is pointing to invalid memory.
The answer to this question suggests that you should use str instead.
Indeed, this version works fine:
In [21]: %%cython
    ...: cdef class foo(object):
    ...:     cdef public str val
    ...:     def __init__(self, str val):
    ...:         self.val = val
So, what is going on in the first version? It seems like the char* is getting freed at some point after class construction, but I'm not really clear on why.
Upvotes: 3
Views: 399
Reputation: 280251
When you convert a Python bytestring to a char * in Cython, Cython gives you a pointer to the contents of the string object. This raw pointer does not affect the string's Python refcount (it'd be infeasible to track which pointers refer to which strings).
When the string's refcount hits zero and the string is reclaimed, your pointer becomes invalid.
You shouldn't convert Python bytestrings to char *s unless you actually need a char *. If you do, make sure to also keep a normal Python reference to the string for as long as you need the char *, to extend the string's lifetime.
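As a minimal sketch of that last point, one way to keep the reference is to store the bytes object on the instance next to the pointer. The attribute names `_owner` and `_ptr` and the read-only `val` property are illustrative choices, not something from the question, and the sketch assumes the caller passes a bytes object:

```cython
cdef class foo(object):
    cdef bytes _owner   # ordinary Python reference; keeps the string alive (name is illustrative)
    cdef char* _ptr     # raw pointer into _owner's buffer; holds no refcount of its own

    def __init__(self, bytes val):
        # Store the Python object first so its refcount cannot drop to zero
        # while this instance exists.
        self._owner = val
        # The pointer stays valid only for as long as self._owner is referenced.
        self._ptr = val

    @property
    def val(self):
        # Returning a char* from a def function makes Cython copy the C string
        # (up to the first NUL byte) into a new Python bytes object.
        return self._ptr
```

With this layout, something like `f = foo(b'aaa'); f.val` should keep returning the same contents, because the instance itself owns a reference to the string. If nothing actually needs the `char *`, storing only the Python object (as in the `str` version from the question) is simpler.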
Upvotes: 3