uday

Reputation: 6713

Writing NumPy code in Cython for arrays with an unknown number of dimensions

Suppose I have Cython code with functions that calculate a rolling moving average and return an array of the same size as the input (the functions fill the initial part with nan, but this is not important for the problem at hand).
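For reference, the intended behaviour can be sketched in pure Python. This is only an illustration of the semantics (NaN for the first m-1 positions, then the mean of the last m values); the name sma_reference is hypothetical and is not part of the Cython module.

```python
import math

def sma_reference(values, m):
    """Rolling mean over a window of m items; positions without a full window are NaN."""
    n = len(values)
    out = [math.nan] * n
    # The first full window ends at index m-1.
    for i in range(m - 1, n):
        out[i] = sum(values[i - m + 1:i + 1]) / m
    return out

print(sma_reference([1.0, 1.4, 1.3, 5.3, 2.3], 3))
```

If the input is shorter than the window, the loop never runs and the result is all NaN, which matches the n < m branch of the Cython functions.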

I have written three Cython functions (given below):

(a) sma_vec to handle 1-dimensional numpy arrays

(b) sma_mat to handle 2-dimensional numpy arrays

(c) a third function, sma, that returns the result of either sma_vec or sma_mat depending on the number of dimensions. (My motivation is to eventually change the cpdef on sma_vec and sma_mat to cdef, so that Python code only sees the sma function.)

Function 1 - handles 1-dimensional numpy arrays

cimport cython
import numpy as np
cimport numpy as np
from numpy cimport ndarray as ar
ctypedef double dtype_t

@cython.boundscheck(False)
@cython.wraparound(False)
cpdef sma_vec(ar[dtype_t, ndim=1] x, int m):
    cdef int n
    cdef Py_ssize_t i, j
    cdef ar[dtype_t, ndim=1] y
    if m == 1:
        return x.copy()
    else:
        y = np.zeros_like(x) * np.nan
        n = x.shape[0]
        if n < m:
            return y
        else:
            for i in range(m-1, n):
                for j in range(i-m+1, i+1):
                    if j == i-m+1:
                        y[i] = x[j]
                    else:
                        y[i] += x[j]
                y[i] /= float(m)
            return y

Function 2 - handles 2-dimensional numpy arrays (calls Function 1 on each row of the ndarray)

@cython.boundscheck(False)
@cython.wraparound(False)
cpdef sma_mat(ar[dtype_t, ndim=2] x, int m):
    cdef int n
    cdef Py_ssize_t i
    cdef ar[dtype_t, ndim=2] y
    if m == 1:
        return x.copy()
    else:
        y = np.zeros_like(x) * np.nan
        n = x.shape[1]  # the window runs along each row, so compare m with the row length
        if n < m:
            return y
        else:
            for i in range(0, x.shape[0]):
                y[i] = sma_vec(x[i], m)
            return y

Function 3 - calls Function 1 or Function 2 depending on the number of dimensions

@cython.boundscheck(False)
@cython.wraparound(False)
cpdef sma(ar[dtype_t] x, int m):
    if x.ndim == 1:
        return sma_vec(x, m)
    elif x.ndim == 2:
        return sma_mat(x, m)
    else:
        raise ValueError('Cannot handle more than two dimensions')

Test Code

import numpy as np
import common.movavg as mv

x1 = np.array([1.0, 1.4, 1.3, 5.3, 2.3])
y1 = mv.sma_vec(x1, 3)
y1a = mv.sma(x1, 3)

Both y1 and y1a correctly return array([nan, nan, 1.233333, 2.666667, 2.966667]).

x2 = np.array([[1.0, 1.4, 1.3, 5.3, 2.3], [4.2, 1.3, 2.3, 5.7, -1.3]])
y2 = mv.sma_mat(x2, 2)

y2 is also correct:

array([[  nan,  1.2 ,  1.35,  3.3 ,  3.8 ],
       [  nan,  2.75,  1.8 ,  4.  ,  2.2 ]])

But when I try:

y2a = mv.sma(x2, 2)

I get an error:

Traceback (most recent call last):
  File "C:\PF\WinPython-64bit-3.4.2.4\python-3.4.2.amd64\lib\site-packages\IPython\core\interactiveshell.py", line 2883, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-dc092e343714>", line 3, in <module>
    y2a = mv.sma(x2, 2)
  File "movavg.pyx", line 54, in movavg.sma (stat\movavg.c:2206)
ValueError: Buffer has wrong number of dimensions (expected 1, got 2)

In the sma function, the problem seems to be that ar[dtype_t] x (i.e. np.ndarray[double] x) is automatically treated as one-dimensional, so a 2-D array is rejected before the x.ndim check is ever reached.

How can I rewrite the sma function so that it accepts an np.ndarray with an unknown number of dimensions?

Upvotes: 2

Views: 810

Answers (1)

uday

Reputation: 6713

Found the answer.

From the numpy_tutorial link: "... “ndim” keyword-only argument, if not provided then one-dimensional is assumed ..."

The solution is to change Function 3 to:

@cython.boundscheck(False)
@cython.wraparound(False)
cpdef sma(ar x, int m):
    if x.ndim == 1:
        return sma_vec(x, m)
    elif x.ndim == 2:
        return sma_mat(x, m)
    else:
        raise ValueError('Cannot handle more than two dimensions')

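The buffer specification in the [] has to be removed completely: ar[dtype_t] implies ndim=1, whereas a plain ar accepts an array with any number of dimensions. sma_vec and sma_mat keep their own typed declarations.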
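One side effect worth noting: a plain ar argument no longer says anything about the element type either, so an integer array would only be rejected once it reaches the typed sma_vec or sma_mat. A minimal sketch of the same dispatch pattern in plain Python, assuming the input should be coerced to float64 first, could look like this. The sma_vec and sma_mat below are pure-NumPy stand-ins written for illustration, not the Cython functions from the question.

```python
import numpy as np

def sma_vec(x, m):
    """Pure-NumPy stand-in for the 1-D Cython function (illustration only)."""
    if m == 1:
        return x.copy()
    y = np.full(x.shape, np.nan)
    if x.shape[0] >= m:
        # Window sums via differences of the cumulative sum.
        c = np.cumsum(np.concatenate(([0.0], x)))
        y[m - 1:] = (c[m:] - c[:-m]) / m
    return y

def sma_mat(x, m):
    """Stand-in for the 2-D function: apply sma_vec to each row."""
    return np.vstack([sma_vec(row, m) for row in x])

def sma(x, m):
    # Assumption: coercing to float64 here is acceptable for the caller.
    x = np.asarray(x, dtype=np.float64)
    if x.ndim == 1:
        return sma_vec(x, m)
    elif x.ndim == 2:
        return sma_mat(x, m)
    raise ValueError('Cannot handle more than two dimensions')

x2 = np.array([[1.0, 1.4, 1.3, 5.3, 2.3], [4.2, 1.3, 2.3, 5.7, -1.3]])
print(sma(x2, 2))
```

The coercion step is optional; without it, the typed functions remain the place where a wrong dtype is caught.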

Upvotes: 2
