Reputation: 71
How can one declare an array of arrays in cython?
More precisely, I want to construct (declare and then initialize) an m by n matrix, call it A, in which each entry [i,j] is a 1-dimensional array of doubles (of length min(i,j)
, filled with zeros) of the form
cdef np.ndarray[np.double_t, ndim=1] A[i,j]
A[i,j] = np.zeros((min(i,j)), dtype=np.double)
For (m,n)=(4,3), print A
should return something like this:
[[[], [], []],
[[], [0.], [0.]],
[[], [0.], [0.,0.]],
[[], [0.], [0.,0.]]]
How do I declare and initialize A?
Upvotes: 6
Views: 2691
Reputation: 60227
The object method:
import numpy
def thing(int m, int n):
cdef int i, j
cdef object[:, :] A = numpy.empty((m, n), dtype=object)
for i in range(A.shape[0]):
for j in range(A.shape[1]):
A[i, j] = numpy.zeros(min(i, j))
return A
Note that that object[:, :]
syntax is the newer version, the numpy.ndarray[object, ndim=2]
version is deprecated. The newer version is GIL-free (well, probably not when using object
types), normally faster (never slower), type-agnostic (works on anything supporting memoryview
) and cleaner.
If you want to iterate over the sub-arrays, you'd be doing:
for i in range(A.shape[0]):
for j in range(A.shape[1]):
subarray = A[i, j]
for k in range(subarray.size):
...
and you can either type subarray
to object
(best for small subarray
s) or float[:]
(best for large subarray
s).
The C-level solution is proving to be very tricky. I have a feeling you'll basically end up writing it with pure-C types.
So I'm abandoning that, and here's what I'd do:
import numpy
def thing(int m, int n):
cdef int i, j
cdef float[:, :, :] A = numpy.zeros((m, n, min(m, n)), dtype=float)
cdef int[:, :] A_lengths = numpy.empty((m, n), dtype=int)
for i in range(A_lengths.shape[0]):
for j in range(A_lengths.shape[1]):
A_lengths[i, j] = min(i, j)
return A, A_lengths
Basically, make a 3D array and a 2D array of corresponding lengths. If there is only a linear variation in lengths (so the max length is a reasonable factor [I'd say up to about 10] of the mean length) then this should have acceptable overhead. It'll allow pure-C calculations while having a tasty memoryview interface.
That's all I've got. Take it or leave it.
Upvotes: 7