kaiyan711

Reputation: 181

Cython - Define 2d arrays

Here is the Cython code I am trying to optimize:

    import cython
    cimport cython
    from libc.stdlib cimport rand, srand, RAND_MAX
    import numpy as np
    cimport numpy as np

    def genLoans(int loanid):
        cdef int i, j, k
        cdef double[:,:,:] loans = np.zeros((240, 20, 1000))
        cdef double[:,:] aggloan = np.zeros((240, 20))
        for j from 0<=j<1000:
            srand(loanid*1000+j)
            for i from 0<=i<240:
                for k from 0<=k<20:
                    loans[i,k,j] = rand()
                    ###some other logics
                    aggloan[i,k] += loans[i,k,j]/1000
        return aggloan

cython -a shows

[image: annotated output of cython -a]

I guess that when I initialize the zero arrays loans and aggloan, numpy slows me down. Yet I need to run 5000+ loans. Are there other ways to define the 3d/2d arrays and return them without using numpy?

Upvotes: 4

Views: 2747

Answers (1)

Davidmh

Reputation: 3865

The yellow part comes from the NumPy call where you allocate the arrays. What you can do is pass these arrays as arguments to the function and reuse them from one call to the next.

Also, I see you are rewriting all the elements, so you are claiming memory, filling it with zeroes, and then putting in your numbers. If you are sure you are overwriting all the elements, you can use np.empty, which allocates the memory without initializing it.
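A minimal sketch of the difference, in plain NumPy rather than Cython. The shapes come from the question; the random generator is only a stand-in for the question's srand/rand calls. Note that in the question's code only loans is fully overwritten, while aggloan is accumulated with +=, so in this sketch it still starts from zeros.

```python
import numpy as np

# Shapes taken from the question.
loans = np.empty((240, 20, 1000))  # uninitialized: contents are arbitrary
aggloan = np.zeros((240, 20))      # accumulated with +=, so it must start at 0

# Stand-in for the question's srand/rand (an assumption, not the original logic).
rng = np.random.default_rng(0)
loans[...] = rng.random(loans.shape)  # every element of loans is overwritten

# Same aggregation as in the question: average over the last axis.
aggloan += loans.sum(axis=2) / 1000
```

If aggloan were assigned with = instead of +=, it could be allocated with np.empty as well.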

Note: the Linux kernel has a specific way of allocating memory initialised to 0 that is faster than filling with any other value, and modern NumPy can use it, but it is still slower than empty:

    In [4]: %timeit np.zeros((100,100))
    100000 loops, best of 3: 4.04 µs per loop

    In [5]: %timeit np.ones((100,100))
    100000 loops, best of 3: 8.99 µs per loop

    In [6]: %timeit np.empty((100,100))
    1000000 loops, best of 3: 917 ns per loop

Last but not least, are you sure this is your bottleneck? I don't know what processing you are doing, but the yellow shading reflects the number of lines of generated C code, not time. Anyway, from the timings, using empty should speed up the allocation by a factor of about four. If you want more, post the rest of your code on Code Review.
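To reproduce the allocation timings outside IPython, a small script along these lines should work. It is only a sketch: the shape matches the timings above, the loop counts are arbitrary choices, and the absolute numbers will differ from machine to machine.

```python
import timeit
import numpy as np

shape = (100, 100)  # same shape as in the timings above
number = 10_000     # calls per measurement (arbitrary choice)

timings = {}
for name, alloc in [("zeros", np.zeros), ("ones", np.ones), ("empty", np.empty)]:
    # Best of 3 repeats, converted to microseconds per call.
    best = min(timeit.repeat(lambda: alloc(shape), number=number, repeat=3))
    timings[name] = best / number * 1e6
    print(f"np.{name}: {timings[name]:.2f} us per call")
```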

Edit:

Expanding on my second sentence: your function signature can be

    def genLoans(int loanid, double[:,:,:] loans, double[:,:] aggloan):

You initialize the arrays before your loop, and just pass them again and again.
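The calling pattern could look roughly like this. It is a plain-NumPy sketch, not the Cython function itself: gen_loans is a hypothetical stand-in, the random generator replaces the question's srand/rand, and the loop runs 3 loans instead of 5000+ to keep it quick.

```python
import numpy as np

def gen_loans(loanid, loans, aggloan):
    """Hypothetical stand-in for genLoans that fills caller-provided buffers."""
    rng = np.random.default_rng(loanid)  # stand-in for srand/rand
    rng.random(out=loans)                # overwrite every element in place
    np.mean(loans, axis=2, out=aggloan)  # write the average into aggloan
    return aggloan

# Allocate once, before the loop over loans.
loans = np.empty((240, 20, 1000))
aggloan = np.empty((240, 20))

totals = []
for loanid in range(3):  # 5000+ in the question
    gen_loans(loanid, loans, aggloan)
    # Use (or copy) the result now: the next call overwrites the buffer.
    totals.append(float(aggloan.sum()))
```

The trade-off is that the returned array is shared between calls, so anything that must outlive the next call has to be copied out first.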

In any case, on my machine (Linux, Intel i5) the allocation takes 9 µs, so over 5000 loans you are spending a total of about 45 ms on it. This is definitely not your bottleneck. Profile!
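One way to profile is the standard-library cProfile module. The sketch below profiles a hypothetical pure-Python stand-in for genLoans, with a smaller array than in the question so that it runs quickly. For a compiled Cython function to show up in cProfile, the module generally has to be built with the profile=True compiler directive.

```python
import cProfile
import pstats
import numpy as np

def gen_loans(loanid):
    """Hypothetical stand-in for the question's genLoans."""
    loans = np.zeros((240, 20, 100))      # allocation (smaller than the question's)
    rng = np.random.default_rng(loanid)   # stand-in for srand/rand
    loans[...] = rng.random(loans.shape)  # the actual work
    return loans.mean(axis=2)

profiler = cProfile.Profile()
profiler.enable()
for loanid in range(50):
    gen_loans(loanid)
profiler.disable()

# Print the five most expensive entries, sorted by cumulative time.
stats = pstats.Stats(profiler)
stats.sort_stats("cumulative").print_stats(5)
```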

Upvotes: 4
