Reputation: 277
I have a large (n=50000) block-diagonal csr_matrix M representing the adjacency matrices of a set of graphs. I have to multiply M by a dense numpy.array v several times, so I use M.dot(v).
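A minimal sketch of the kind of setup I mean (the block sizes and density here are made up, not my actual data):
import numpy as np
from scipy import sparse

# Hypothetical reproduction: a block-diagonal matrix built from many small
# random adjacency-like blocks (sizes and density are invented for illustration).
blocks = [sparse.random(50, 50, density=0.1, format='csr') for _ in range(100)]
M = sparse.block_diag(blocks, format='csr')   # 5000x5000 csr_matrix
v = np.random.randn(M.shape[0])

M_dense = M.toarray()                         # dense copy for comparison

# In IPython:
# %timeit M.dot(v)         # sparse matrix-vector product
# %timeit M_dense.dot(v)   # dense matrix-vector product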
Surprisingly, I have discovered that first converting M to a numpy.array and then using numpy.dot is much faster.
Any ideas why this is the case?
Upvotes: 0
Views: 2312
Reputation: 15889
I don't have enough memory to hold a 50000x50000 dense matrix and multiply it by a 50000-element vector, but here are some tests at lower dimensionality.
Setup:
import numpy as np
from scipy.sparse import csr_matrix
def make_csr(n, N):
    rows = np.random.choice(N, n)
    cols = np.random.choice(N, n)
    data = np.ones(n)
    return csr_matrix((data, (rows, cols)), shape=(N, N), dtype=np.float32)
The code above generates sparse matrices with n non-zero elements in an NxN matrix.
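One detail worth noting: when csr_matrix is built from (data, (rows, cols)), duplicate (row, col) pairs are summed into a single stored entry, so the actual number of stored non-zeros can come out a bit below n. A quick check, reusing make_csr from above:
M_test = make_csr(1000*1000, 5000)
print(M_test.nnz)   # a bit below 1000000, since repeated coordinates collapse into one entry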
Matrices:
N = 5000
# Sparse matrices
A = make_csr(10*10, N) # ~100 non-zero
B = make_csr(100*100, N) # ~10000 non-zero
C = make_csr(1000*1000, N) # ~1000000 non-zero
D = make_csr(5000*5000, N) # ~25000000 non-zero
E = csr_matrix(np.random.randn(N,N), dtype=np.float32) # non-sparse
# Numpy dense arrays
An = A.todense()
Bn = B.todense()
Cn = C.todense()
Dn = D.todense()
En = E.todense()
b = np.random.randn(N)
Timings:
>>> %timeit A.dot(b) # 9.63 µs per loop
>>> %timeit An.dot(b) # 41.6 ms per loop
>>> %timeit B.dot(b) # 41.3 µs per loop
>>> %timeit Bn.dot(b) # 41.2 ms per loop
>>> %timeit C.dot(b) # 3.2 ms per loop
>>> %timeit Cn.dot(b) # 41.2 ms per loop
>>> %timeit D.dot(b) # 35.4 ms per loop
>>> %timeit Dn.dot(b) # 43.2 ms per loop
>>> %timeit E.dot(b) # 55.5 ms per loop
>>> %timeit En.dot(b) # 43.4 ms per loop
- For the very sparse matrices (A and B) it is more than 1000x faster.
- For the moderately sparse matrix (C, roughly 4% of entries non-zero), it still gets a 10x speedup.
- For the nearly full matrix (D, which still has some zero entries because repeated indices are merged into single entries), it is still faster; not by much, but faster.
- For the completely non-sparse matrix (E), the operation is slower, but not much slower.
Conclusion: the speedup you get depends on the sparsity of your matrix, but with N = 5000, sparse matrices are always faster (as long as they have some zero entries).
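The reason is that the sparse product only touches the stored entries (roughly O(nnz) work), whereas the dense product always does the full O(N^2) work no matter how many entries are zero. To see how sparse each test matrix actually ends up, you can print the stored-entry density, assuming the matrices defined above:
for name, m in [('A', A), ('B', B), ('C', C), ('D', D), ('E', E)]:
    print(name, m.nnz, '{:.2%} of entries stored'.format(m.nnz / float(N * N)))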
I can't try it for N = 50000 due to memory issues, but you can run the code above and see what it looks like for that N on your machine.
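For scale, the memory problem is easy to quantify: a dense 50000x50000 float32 array by itself needs about 9.3 GiB (twice that for float64), before counting the sparse original and the result vector.
N_big = 50000
dense_bytes = N_big * N_big * 4          # float32 entries, 4 bytes each
print(dense_bytes / 2.0**30, 'GiB')      # ~9.31 GiB; float64 would need ~18.6 GiB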
Upvotes: 4