Jing

Reputation: 945

Numpy Slicing slow?

Hi, I am doing scientific computing with NumPy and Numba. I've noticed that in-place NumPy array addition is very slow compared to MATLAB.

Here is the MATLAB code:

tic;
% A,B are 2-d matrices, ind may not be distinct
for ii=1:N 
    A(ind(ii),:) =  A(ind(ii),:) +  B(ii,:);
end
toc;

And here is the NumPy code:

import time

s = time.time()
# A, B are numpy.ndarray; ind may not be distinct
for k in range(N):
    A[ind[k], :] += B[k, :]
print(time.time() - s)

The result shows that the NumPy code is 10x slower than the MATLAB code, which confuses me a lot.

Moreover, when I pull the addition out of the for loop and just compare a single matrix addition using numpy.add, NumPy and MATLAB are comparable in speed.

One factor I know of is that MATLAB (version 2012a and later) uses a JIT to speed up for loops, but when I tried Numba on the Python code it did not speed up at all. I think this is because Numba does not touch the numpy.add function, so the performance does not change.

I am guessing that MATLAB does some heavy caching in this case, which is why it beats NumPy so dramatically.

Any suggestions on how to speed up NumPy?

Upvotes: 4

Views: 2079

Answers (2)

YXD

Reputation: 32521

Try

A[ind] += B[:N]

i.e. without any loop.

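If ind can contain duplicate indices (as in your case), the line above is not enough: the fancy-indexed += is buffered, so a repeated row receives only one of its additions. Use np.add.at, which accumulates every occurrence: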

np.add.at(A, ind, B[:N])
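
To see why the duplicate case matters, here is a small sketch with made-up arrays (the shapes and values are illustrative assumptions, not taken from the question). It compares the explicit loop, the buffered fancy-index update, and np.add.at:

```python
import numpy as np

# Illustrative data (assumed, not from the question): row 0 appears twice in ind.
ind = np.array([0, 2, 0])
B = np.array([[1.0, 1.0],
              [10.0, 10.0],
              [100.0, 100.0]])
N = len(ind)

# Reference result: the explicit loop from the question.
A_loop = np.zeros((3, 2))
for k in range(N):
    A_loop[ind[k], :] += B[k, :]

# Buffered fancy-index update: a repeated index is not accumulated.
A_fancy = np.zeros((3, 2))
A_fancy[ind] += B[:N]

# Unbuffered update: every occurrence of an index contributes.
A_at = np.zeros((3, 2))
np.add.at(A_at, ind, B[:N])

print(np.array_equal(A_at, A_loop))     # np.add.at agrees with the loop
print(np.array_equal(A_fancy, A_loop))  # the buffered update does not
```

np.add.at is correct with duplicates but is not guaranteed to be fast, so it is worth timing it against the loop on your real sizes.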

Upvotes: 3

hpaulj

Reputation: 231690

Here's a version that uses dot (matrix multiplication). It constructs a matrix of 1s and 0s from ind.

def bar(A, B, ind):
    K, M = B.shape
    N, M = A.shape
    # I[n, k] = 1 where row k of B is added to row n of A
    I = np.zeros((N, K))
    I[ind, np.arange(K)] = 1
    return A + np.dot(I, B)

For a problem with sizes like K,M,N = 30,14,15 this is about 3x faster. But for larger ones like K,M,N = 300,100,150 it's a bit slower.
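
As a sanity check, the sketch below compares the indicator-matrix approach with the plain loop on random data that contains repeated indices. The seed, the sizes, and the helper name loop_add are assumptions for illustration; bar is repeated so the snippet runs on its own:

```python
import numpy as np

def bar(A, B, ind):
    K, M = B.shape
    N, _ = A.shape
    # Indicator matrix: column k has a single 1 in row ind[k].
    I = np.zeros((N, K))
    I[ind, np.arange(K)] = 1
    return A + np.dot(I, B)

def loop_add(A, B, ind):
    # Hypothetical reference helper: the loop from the question, on a copy.
    out = A.copy()
    for k in range(len(ind)):
        out[ind[k], :] += B[k, :]
    return out

rng = np.random.default_rng(0)
K, M, N = 30, 14, 15
A = rng.standard_normal((N, M))
B = rng.standard_normal((K, M))
ind = rng.integers(0, N, size=K)  # K > N, so some indices must repeat

print(np.allclose(bar(A, B, ind), loop_add(A, B, ind)))
```

Note that bar returns a new array rather than updating A in place, and the (N, K) indicator matrix grows with both sizes, which may be part of why it falls behind on larger problems.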

Upvotes: 0
