Brad Solomon

Reputation: 40878

Define array of strings in Cython

Stuck on some basic Cython here: what's a canonical and efficient way to define an array of strings in Cython? Specifically, I want to define a fixed-length constant array of char. (Please note that I would prefer not to bring in NumPy at this point.)

In C this would be:

/* cletters.c */
#include <stdio.h>

int main(void)
{
    const char *headers[3] = {"to", "from", "sender"};
    int i;
    for (i = 0; i < 3; i++)
        printf("%s\n", headers[i]);
}

Attempt in Cython:

# cython: language_level=3
# letters.pyx

cpdef main():
    cdef const char *headers[3] = {"to", "from", "sender"}
    print(headers)

However, this gives:

(cy) $ python3 ./setup.py build_ext --inplace --quiet
cpdef main():
    cdef const char *headers[3] = {"to", "from", "sender"}
                               ^
------------------------------------------------------------

letters.pyx:5:32: Syntax error in C variable declaration

Upvotes: 5

Views: 2157

Answers (2)

Dev Aggarwal

Reputation: 8516

For Python 3 Unicode strings, this is possible:

cdef Py_UNICODE* x[2] 
x = ["hello", "worlᏪd"]

or

cdef Py_UNICODE** x
x = ["hello", "worlᏪd"]

Upvotes: 3

ead

Reputation: 34326

You need two lines:

%%cython
cpdef main():
    cdef const char *headers[3] 
    headers[:] = ['to','from','sender']
    print(headers)

Somewhat counterintuitively, one assigns unicode strings (Python 3!) to char*. That is one of Cython's quirks. On the other hand, when initializing every element with a single value, a bytes object is needed:

%%cython
cpdef main():
    cdef const char *headers[3] 
    headers[:] = b'init_value'  ## unicode-string 'init_value' doesn't work.
    print(headers)

Another alternative is the following oneliner:

%%cython
cpdef main():
    cdef const char **headers=['to','from','sender']

    print(headers[0], headers[1], headers[2])

which is not exactly the same as above and leads to the following C-code:

  char const **__pyx_v_headers;
  ...
  char const *__pyx_t_1[3];
  ...
  __pyx_t_1[0] = ((char const *)"to");
  __pyx_t_1[1] = ((char const *)"from");
  __pyx_t_1[2] = ((char const *)"sender");
  __pyx_v_headers = __pyx_t_1;

__pyx_v_headers is of type char const ** and the downside is that print(headers) no longer works out of the box.
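To see why the two declarations behave differently, here is a minimal plain-C sketch (not Cython output). It assumes the difference comes down to the usual C rule that a fixed-size array carries its element count in its type, while a pointer to its first element does not. The function names header_count and total_length are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fixed-size array: the element count (3) is part of the type. */
const char *headers[3] = {"to", "from", "sender"};

/* The count can be recovered only because `headers` is an array here. */
size_t header_count(void)
{
    return sizeof(headers) / sizeof(headers[0]);
}

/* A `const char **` carries no length, so the caller has to pass it.
   (Hypothetical helper: sums the lengths of the first n strings.) */
size_t total_length(const char **items, size_t n)
{
    size_t total = 0;
    assert(items != NULL);
    for (size_t i = 0; i < n; i++)
        total += strlen(items[i]);
    return total;
}
```

Calling total_length(headers, header_count()) works because the array decays to a pointer and the length is passed alongside it. With only a char const ** in hand there is no way to derive that length, which is presumably why print(headers) cannot convert the pointer to a Python list automatically and the elements have to be indexed one by one.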

Upvotes: 5
