Reputation: 277
I have a large (n=50000) block-diagonal csr_matrix M representing the adjacency matrices of a set of graphs. I have to multiply M by a dense numpy.array v several times, so I use M.dot(v).
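A minimal sketch of the kind of setup I mean (the block sizes and density here are made up, not my actual data):
import numpy as np
from scipy import sparse

# Hypothetical reproduction: a block-diagonal matrix built from many small
# random adjacency-like blocks (sizes and density are invented for illustration).
blocks = [sparse.random(50, 50, density=0.1, format='csr') for _ in range(100)]
M = sparse.block_diag(blocks, format='csr')   # 5000x5000 csr_matrix
v = np.random.randn(M.shape[0])

M_dense = M.toarray()                         # dense copy for comparison

# In IPython:
# %timeit M.dot(v)         # sparse matrix-vector product
# %timeit M_dense.dot(v)   # dense matrix-vector product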
Surprisingly, I have discovered that first converting M to a numpy.array and then using numpy.dot is much faster.
Any ideas why this is the case?
Upvotes: 0
Views: 2312
Reputation: 15889
I don't have enough memory to hold a 50000x50000 dense matrix and multiply it by a 50000-element vector, but here are some tests at lower dimensionality.
Setup:
import numpy as np
from scipy.sparse import csr_matrix
def make_csr(n, N):
    rows = np.random.choice(N, n)
    cols = np.random.choice(N, n)
    data = np.ones(n)
    return csr_matrix((data, (rows, cols)), shape=(N, N), dtype=np.float32)
The code above generates sparse matrices with n non-zero elements in an NxN matrix.
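One detail worth noting: when csr_matrix is built from (data, (rows, cols)), duplicate (row, col) pairs are summed into a single stored entry, so the actual number of stored non-zeros can come out a bit below n. A quick check, reusing make_csr from above:
M_test = make_csr(1000*1000, 5000)
print(M_test.nnz)   # a bit below 1000000, since repeated coordinates collapse into one entry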
Matrices:
N = 5000
# Sparse matrices
A = make_csr(10*10, N) # ~100 non-zero
B = make_csr(100*100, N) # ~10000 non-zero
C = make_csr(1000*1000, N) # ~1000000 non-zero
D = make_csr(5000*5000, N) # ~25000000 non-zero
E = csr_matrix(np.random.randn(N,N), dtype=np.float32) # non-sparse
# Numpy dense arrays
An = A.todense()
Bn = B.todense()
Cn = C.todense()
Dn = D.todense()
En = E.todense()
b = np.random.randn(N)
Timings:
>>> %timeit A.dot(b) # 9.63 µs per loop
>>> %timeit An.dot(b) # 41.6 ms per loop
>>> %timeit B.dot(b) # 41.3 µs per loop
>>> %timeit Bn.dot(b) # 41.2 ms per loop
>>> %timeit C.dot(b) # 3.2 ms per loop
>>> %timeit Cn.dot(b) # 41.2 ms per loop
>>> %timeit D.dot(b) # 35.4 ms per loop
>>> %timeit Dn.dot(b) # 43.2 ms per loop
>>> %timeit E.dot(b) # 55.5 ms per loop
>>> %timeit En.dot(b) # 43.4 ms per loop
- For the very sparse matrices (A and B) it is more than 1000x faster.
- For the moderately sparse matrix (C, roughly 4% of entries non-zero), it still gets a 10x speedup.
- For the nearly full matrix (D, which still has some zero entries because repeated indices are merged into single entries), it is still faster; not by much, but faster.
- For the completely non-sparse matrix (E), the operation is slower, but not much slower.
Conclusion: the speedup you get depends on the sparsity of your matrix, but with N = 5000, sparse matrices are always faster (as long as they have some zero entries).
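The reason is that the sparse product only touches the stored entries (roughly O(nnz) work), whereas the dense product always does the full O(N^2) work no matter how many entries are zero. To see how sparse each test matrix actually ends up, you can print the stored-entry density, assuming the matrices defined above:
for name, m in [('A', A), ('B', B), ('C', C), ('D', D), ('E', E)]:
    print(name, m.nnz, '{:.2%} of entries stored'.format(m.nnz / float(N * N)))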
I can't try it for N = 50000 due to memory issues, but you can run the code above and see what it looks like for that N on your machine.
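For scale, the memory problem is easy to quantify: a dense 50000x50000 float32 array by itself needs about 9.3 GiB (twice that for float64), before counting the sparse original and the result vector.
N_big = 50000
dense_bytes = N_big * N_big * 4          # float32 entries, 4 bytes each
print(dense_bytes / 2.0**30, 'GiB')      # ~9.31 GiB; float64 would need ~18.6 GiB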
Upvotes: 4