Cython Typing List of Strings

Question

I'm trying to use Cython to improve the performance of a loop, but I'm running into issues declaring the types of the inputs.

How do I include a field in my typed struct that holds a string which can be either 'front' or 'back'?

I have a np.recarray that looks like the following (note that the length of the recarray is unknown at compile time):

import numpy as np
weights = np.recarray(4, dtype=[('a', np.int64),  ('b', np.str_, 5), ('c', np.float64)])
weights[0] = (0, "front", 0.5)
weights[1] = (0, "back", 0.5)
weights[2] = (1, "front", 1.0)
weights[3] = (1, "back", 0.0)

as well as two further inputs, a list of strings and a pandas.Timestamp:

import pandas as pd
ts = pd.Timestamp("2015-01-01")
contracts = ["CLX16", "CLZ16"]

I am trying to cythonize the following loop:

def ploop(weights, contracts, timestamp):
    cwts = []
    for gen_num, position, weighting in weights:
        if weighting != 0:
            if position == "front":
                cntrct_idx = gen_num
            elif position == "back":
                cntrct_idx = gen_num + 1
            else:
                raise ValueError("transition.columns must contain "
                                 "'front' or 'back'")
            cwts.append((gen_num, contracts[cntrct_idx], weighting, timestamp))
    return cwts

My attempt involved typing the weights input as a struct in Cython, in a file struct_test.pyx, as follows:

import numpy as np
cimport numpy as np


cdef packed struct tstruct:
    np.int64_t gen_num
    char[5] position
    np.float64_t weighting


def cloop(tstruct[:] weights_array, contracts, timestamp):
    cdef tstruct weights
    cdef int i
    cdef int cntrct_idx

    cwts = []
    for k in xrange(len(weights_array)):
        w = weights_array[k]
        if w.weighting != 0:
            if w.position == "front":
                cntrct_idx = w.gen_num
            elif w.position == "back":
                cntrct_idx = w.gen_num + 1
            else:
                raise ValueError("transition.columns must contain "
                                 "'front' or 'back'")
            cwts.append((w.gen_num, contracts[cntrct_idx], w.weighting,
                         timestamp))
    return cwts

But I am receiving a runtime error, which I believe is related to the char[5] position field:

import pyximport
pyximport.install()
import struct_test

struct_test.cloop(weights, contracts, ts)

ValueError: Does not understand character buffer dtype format string ('w')
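
One way to narrow this down is to check how NumPy actually stores the 'b' field. The sketch below assumes the likely cause is that np.str_ creates a unicode field (4 bytes per character under Python 3) rather than a 5-byte char array; this has not been verified against the Cython build in question. The names unicode_dtype and bytes_dtype are illustrative only, and the np.bytes_ variant is a hypothetical alternative, not something the original code uses.

```python
import numpy as np

# Same layout as the recarray above: np.str_ produces a unicode field.
unicode_dtype = np.dtype([('a', np.int64), ('b', np.str_, 5), ('c', np.float64)])

# Hypothetical alternative: np.bytes_ produces a fixed-width byte-string field,
# which is the kind of storage a C char[5] member describes.
bytes_dtype = np.dtype([('a', np.int64), ('b', np.bytes_, 5), ('c', np.float64)])

# Field kind ('U' = unicode, 'S' = byte string) and size in bytes.
print(unicode_dtype['b'].kind, unicode_dtype['b'].itemsize)
print(bytes_dtype['b'].kind, bytes_dtype['b'].itemsize)

# Total record size; NumPy does not pad fields unless align=True is requested,
# so this can be compared against the size of the packed struct.
print(bytes_dtype.itemsize)
```

If the unicode field turns out to be 20 bytes wide while the byte-string field is 5, that mismatch with char[5] would be consistent with the buffer format error. Note that with a byte-string field the values would have to be assigned and compared as bytes (for example b"front") under Python 3.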

In addition, I am a bit unclear on how I would go about typing contracts and timestamp.
