Reputation: 490
I have a simple question, why this is not efficient:
import numpy as np
cimport numpy as c_np
import cython
def function():
cdef c_np.ndarray[double, ndim=2] A = np.random.random((10,10))
cdef c_np.ndarray[double, ndim=1] slice
slice = A[1,:] #this line is marked as slow by the profiler cython -a
return
How should I slice a numpy matrix in python without overhead. In my code, A is an adjacency matrix, so the slices are the neighbours in my routing algorithm.
Upvotes: 0
Views: 842
Reputation: 91
At the risk of repeating an already good answer (and not answering the question!), I'm going to point out a common misunderstanding with the annotator... Yellow indicates a lot of interacting with the python interpreter. That (and the code you see when expanded) is a really useful hint when optimizating.
However! To quote from the first paragraph in every document on code optimization ever:
Profile first, then optimize. Seriously: don't guess, profile first.
And the annotations are definitely not a profile. Check out the cython docs on profiling and maybe this answer for line profiling How to profile cython functions line-by-line
As an example my code has some bright yellow where it calls some numpy functions on big arrays. There's actually very, very little room there for improvement* as that python interaction overhead is amortized over a lot of computation on those big arrays.
*I might be able to eek out a little by using the numpy C interface directly, but again, amortized over huge computation.
Upvotes: 1
Reputation: 52236
The lines marked by the annotator are only suggestions, not based on actual profiling. I think it uses a relatively simple heuristic, something like number of python api calls. It also does not take into account the number of times something is called - yellow lines inside tight loops are much more important than something called once.
In this case, what you are doing is fairly efficient - one call to numpy to get the sliced array, and the assignment of that array to a buffer.
The generated C code looks like it may be better using the memoryview syntax which is functionally equivalent, but you would have to profile to know for sure if this is actually faster.
%%cython -a
import numpy as np
cimport numpy as c_np
import cython
def function():
cdef double[:, :] A = np.random.random((10,10))
cdef double[:] slice
slice = A[1,:]
return
Upvotes: 2