Numpy Slicing slow?

Question

Hi, I am doing scientific computing with numpy + numba, and I've noticed that in-place addition on rows of a numpy array is very slow compared to MATLAB.

Here is the MATLAB code:

tic;
% A, B are 2-D matrices; ind may contain repeated indices
for ii=1:N 
    A(ind(ii),:) =  A(ind(ii),:) +  B(ii,:);
end
toc;

And here is the numpy code:

import time

s = time.time()
# A, B are numpy.ndarray; ind may contain repeated indices
for k in range(N):
    A[ind[k], :] += B[k, :]
print(time.time() - s)

The numpy code turns out to be about 10x slower than the MATLAB code, which confuses me a lot.

Moreover, when I pull the addition out of the for loop and just time a single matrix addition with numpy.add, numpy and MATLAB are comparable in speed.

One factor I know of is that MATLAB (version >= 2012a) uses a JIT to speed up for loops. But when I tried numba on the Python code, it did not get any faster. I think this is because numba does not touch the numpy.add function at all, so the performance does not change.

I am guessing that MATLAB does some sick caching for this case, which is why it beats numpy so dramatically.

Any suggestions on how to speed up numpy?

YXD · Accepted Answer

Try

A[ind] += B[:N]

i.e. without any loop.

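If ind could have duplicate elements, the line above applies only one update per repeated index instead of summing them all, so it no longer matches your loop. In that case you can use np.add.at, which accumulates every update: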

np.add.at(A, ind, B[:N])
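
To see the difference between the two forms when ind has repeats, here is a minimal sketch. The small arrays and the helper names (accumulate_loop, accumulate_fancy, accumulate_add_at) are made up for illustration and are not from the question; each helper works on a copy of A so the results can be compared.

```python
import numpy as np

def accumulate_loop(A, ind, B):
    """Reference behaviour: the row-by-row loop from the question."""
    out = A.copy()
    for k in range(len(ind)):
        out[ind[k], :] += B[k, :]
    return out

def accumulate_fancy(A, ind, B):
    """Loop-free fancy-index update; only correct when ind has no repeats."""
    out = A.copy()
    out[ind] += B[:len(ind)]
    return out

def accumulate_add_at(A, ind, B):
    """Unbuffered in-place add; repeated indices are accumulated."""
    out = A.copy()
    np.add.at(out, ind, B[:len(ind)])
    return out

# Hypothetical toy data: rows 0 and 2 are each hit twice.
A = np.zeros((3, 2))
B = np.ones((4, 2))
ind = np.array([0, 2, 2, 0])

loop_result = accumulate_loop(A, ind, B)
fancy_result = accumulate_fancy(A, ind, B)
add_at_result = accumulate_add_at(A, ind, B)

print(np.array_equal(add_at_result, loop_result))  # np.add.at matches the loop
print(np.array_equal(fancy_result, loop_result))   # fancy indexing does not
```

With this data the loop and np.add.at both add 2 to rows 0 and 2, while A[ind] += B adds only 1 to each. It is worth timing both on your real arrays, since which one is faster may depend on your numpy version and array shapes.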
