GlacierSG
GlacierSG

Reputation: 469

Numpy zeros in array

I'm trying to get maximum performance out of numpy and was wondering if there was a better way to calculate the dot product with an array that has a lot of zeros in it for example:

a = np.array([[0, 3, 0], [1, 0, 1]])
print a.dot([1, 2, 5])

this is a small example but if we had a much larger scale array that has lets say 80% zeros at any place in the array, my question would be is there a better or preferably faster way to calculate the dot product when there are so many zeros?

Upvotes: 1

Views: 984

Answers (1)

hpaulj
hpaulj

Reputation: 231385

In [269]: from scipy import sparse
In [270]: M=sparse.random(1000,1000,.1, 'csr')
In [271]: MA = M.A
In [272]: timeit M*M.T
10 loops, best of 3: 64 ms per loop
In [273]: timeit [email protected]
10 loops, best of 3: 60.4 ms per loop

I defined a random sparse matrix with a specified sparsity, 10%:

In [274]: M
Out[274]: 
<1000x1000 sparse matrix of type '<class 'numpy.float64'>'
    with 100000 stored elements in Compressed Sparse Row format>
In [275]: np.allclose([email protected], (M*M.T).A)
Out[275]: True

@ is an operator form of dot (see np.matmul). So at this 10% level of sparsity, the two approaches time the same (without any conversion to/from sparse).

For this random matrix, the M*M.T result is dense:

In [282]: (M*M.T)
Out[282]: 
<1000x1000 sparse matrix of type '<class 'numpy.float64'>'
    with 999964 stored elements in Compressed Sparse Row format>

The sparse times depend heavily on sparsity; the dense times not at all

In [295]: M=sparse.random(1000,1000,.01, 'csr'); MA=M.A
In [296]: timeit M*M.T
100 loops, best of 3: 2.44 ms per loop
In [297]: timeit [email protected]
10 loops, best of 3: 56.3 ms per loop
In [298]: M=sparse.random(1000,1000,.2, 'csr'); MA=M.A
In [299]: timeit M*M.T
10 loops, best of 3: 175 ms per loop
In [300]: timeit [email protected]
10 loops, best of 3: 56.3 ms per loop

With a round trip to sparse and back, times jump from 60 to 100ms

In [302]: %%timeit
     ...: M1=sparse.csr_matrix(MA)
     ...: (M1*M1.T).A
     ...: 
10 loops, best of 3: 104 ms per loop

Upvotes: 1

Related Questions