Suppose I have some Cython code with functions that calculate a rolling moving average and return an array of the same size as the input (the functions fill the initial part with nan, but this is not important for the problem at hand).
I have written three Cython functions (given below):
(a) sma_vec to handle 1-dimensional numpy arrays
(b) sma_mat to handle 2-dimensional numpy arrays
(c) a third, sma, to return values from either sma_vec or sma_mat depending upon the number of dimensions. (My motivation is to eventually replace the cpdef before sma_vec and sma_mat with cdef so that the Python code only sees the sma function.)
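For readers who want to check the numbers further down, here is a minimal pure-Python sketch of the same rolling average. The name sma_reference is hypothetical and not part of the original module; it assumes the window covers the current sample and the m-1 samples before it, with nan where the window is incomplete.

```python
import math


def sma_reference(values, m):
    """Trailing mean over the last m samples; the first m-1 outputs are nan."""
    if m < 1:
        raise ValueError("window length m must be at least 1")
    out = [math.nan] * len(values)
    # Start at the first index where a full window of m samples is available.
    for i in range(m - 1, len(values)):
        window = values[i - m + 1:i + 1]
        out[i] = sum(window) / m
    return out


print(sma_reference([1.0, 1.4, 1.3, 5.3, 2.3], 3))
```

Rounded to six decimals, the three non-nan values printed here agree with the output quoted for sma_vec in the test code.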
Function 1 - handles 1-dimensional numpy arrays
    cimport cython
    import numpy as np
    cimport numpy as np
    from numpy cimport ndarray as ar
    ctypedef double dtype_t

    @cython.boundscheck(False)
    @cython.wraparound(False)
    cpdef sma_vec(ar[dtype_t, ndim=1] x, int m):
        cdef int n
        cdef Py_ssize_t i, j
        cdef ar[dtype_t, ndim=1] y
        if m == 1:
            return x.copy()
        else:
            y = np.zeros_like(x) * np.nan
            n = x.shape[0]
            if n < m:
                return y
            else:
                for i in range(m-1, n):
                    for j in range(i-m+1, i+1):
                        if j == i-m+1:
                            y[i] = x[j]
                        else:
                            y[i] += x[j]
                    y[i] /= float(m)
                return y
Function 2 - handles 2-dimensional numpy arrays (calls Function 1 on each row of the ndarray)
    @cython.boundscheck(False)
    @cython.wraparound(False)
    cpdef sma_mat(ar[dtype_t, ndim=2] x, int m):
        cdef int n
        cdef Py_ssize_t i
        cdef ar[dtype_t, ndim=2] y
        if m == 1:
            return x.copy()
        else:
            y = np.zeros_like(x) * np.nan
            # The window runs along each row, so compare m with the row length
            n = x.shape[1]
            if n < m:
                return y
            else:
                for i in range(0, x.shape[0]):
                    y[i] = sma_vec(x[i], m)
                return y
Function 3 - calls Function 1 or Function 2 depending upon the number of dimensions
    @cython.boundscheck(False)
    @cython.wraparound(False)
    cpdef sma(ar[dtype_t] x, int m):
        if x.ndim == 1:
            return sma_vec(x, m)
        elif x.ndim == 2:
            return sma_mat(x, m)
        else:
            raise ValueError('Cannot handle more than two dimensions')
Test Code
    import numpy as np
    import common.movavg as mv
    x1 = np.array([1.0, 1.4, 1.3, 5.3, 2.3])
    y1 = mv.sma_vec(x1, 3)
    y1a = mv.sma(x1, 3)
y1 and y1a both correctly return array([nan, nan, 1.233333, 2.666667, 2.966667]).
    x2 = np.array([[1.0, 1.4, 1.3, 5.3, 2.3], [4.2, 1.3, 2.3, 5.7, -1.3]])
    y2 = mv.sma_mat(x2, 2)
y2 correctly returns
    array([[ nan,  1.2 ,  1.35,  3.3 ,  3.8 ],
           [ nan,  2.75,  1.8 ,  4.  ,  2.2 ]])
But when I try:
    y2a = mv.sma(x2, 2)
I get an error:
    Traceback (most recent call last):
      File "C:\PF\WinPython-64bit-3.4.2.4\python-3.4.2.amd64\lib\site-packages\IPython\core\interactiveshell.py", line 2883, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "<ipython-input-4-dc092e343714>", line 3, in <module>
        y2a = mv.sma(x2, 2)
      File "movavg.pyx", line 54, in movavg.sma (stat\movavg.c:2206)
    ValueError: Buffer has wrong number of dimensions (expected 1, got 2)
In the sma function, the issue seems to be that ar[dtype_t] x (i.e. np.ndarray[double] x) automatically assumes that x.ndim should be 1.
How can I rewrite the sma function so that it can accept an np.ndarray with an unknown number of dimensions?
Upvotes: 2
Views: 810
Found the answer.
From this link: numpy_tutorial, "... “ndim” keyword-only argument, if not provided then one-dimensional is assumed ..."
The solution is to change Function 3 to:
    @cython.boundscheck(False)
    @cython.wraparound(False)
    cpdef sma(ar x, int m):
        if x.ndim == 1:
            return sma_vec(x, m)
        elif x.ndim == 2:
            return sma_mat(x, m)
        else:
            raise ValueError('Cannot handle more than two dimensions')
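We need to remove the [] and everything in it completely, so that x is declared as a plain ar (np.ndarray) with no fixed dtype or number of dimensions. The typed signatures of sma_vec and sma_mat still check the buffer when sma passes x on to them.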
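The same dispatch-on-ndim pattern can be tried in plain NumPy before compiling anything. This is only a sketch: sma_vec_py, sma_mat_py and sma_py are hypothetical stand-ins for the Cython functions, not part of the original module. Because the untyped wrapper no longer constrains the dtype itself, the sketch also assumes it is acceptable to coerce the input to float64 up front with np.asarray.

```python
import numpy as np


def sma_vec_py(x, m):
    """1-D kernel: trailing mean over m samples, nan where the window is incomplete."""
    y = np.full(x.shape, np.nan)
    for i in range(m - 1, x.shape[0]):
        y[i] = x[i - m + 1:i + 1].sum() / m
    return y


def sma_mat_py(x, m):
    """2-D kernel: apply the 1-D kernel to each row."""
    y = np.full(x.shape, np.nan)
    for i in range(x.shape[0]):
        y[i] = sma_vec_py(x[i], m)
    return y


def sma_py(x, m):
    """Wrapper that accepts any dimensionality and dispatches at run time."""
    x = np.asarray(x, dtype=np.float64)  # assumption: coercing to double is acceptable
    if x.ndim == 1:
        return sma_vec_py(x, m)
    elif x.ndim == 2:
        return sma_mat_py(x, m)
    else:
        raise ValueError('Cannot handle more than two dimensions')


x2 = np.array([[1.0, 1.4, 1.3, 5.3, 2.3], [4.2, 1.3, 2.3, 5.7, -1.3]])
print(sma_py(x2, 2))
```

As in the Cython version, only the wrapper is dimension-agnostic; each kernel still works on a fixed number of dimensions.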
Upvotes: 2