Out of memory error when multiplying two very large sparse matrices in MATLAB

Question

I have a very large sparse tall matrixA of size (M x N , M >>>>> N) and other matrixB of size N x 1

I want to do

MatrixA*MatrixB

and get M x 1 output. But I get an out of memory error. The matrices only have a couple of 100 nonzero elements. What's the way around this?

Andras Deak -- Слава Україні · Accepted Answer

As @patrik noted in a now-deleted answer, the problem is that MATLAB stores sparse matrices column-wise rather than elementwise. This means that empty columns of your sparse matrix take up a tiny amount of memory, and a huge number of empty columns takes up an excessive amount of memory, even when the overall nonzero elements are reasonably few. A corresponding quote from one of your comments:

whos for one of the variables is MatrixA 31679201751184x290953480 double sparse

I believe my saying "huge" is a bit of an understatement.

Since you've noted in comments that your matrices, especially the second one, contain very few nonzero elements, it's possibly quicker, but definitely feasible and more memory-efficient to implement the matrix product yourself.

Pick the nonzero indices of your matrices using find, then loop over these nonzeros of your matrices to construct the matrix product. The only thing you need is that (with some math-based pseudocode)

[MatrixA * MatrixB](m,l) = sum_k MatrixA(m,k)*MatrixB(k,l)

The indices you need for this all come from the vectors

[kvec, lvec] = find(MatrixB);
[mvec, ~] = find(MatrixA);

which especially in your case will be very light on the lvec side. After some preallocation it should be straightforward to loop over mvec and lvec, summing up the necessary terms with respect to elements of kvec along the way.

If your matrixB is indeed just a column vector, then your job is even easier: then you need

[MatrixA * MatrixB](m,1) = sum_k MatrixA(m,k)*MatrixB(k,1)

for which you essentially need

k1vec = find(MatrixB);
[mvec, k2vec] = find(MatrixA);

%// choose those 'k' indices which are nonzero in both matrices
both_k = intersect(k1vec,k2vec);
inds_A = ismember(k2vec,both_k);
inds_B = ismember(k1vec,both_k);
mvec = mvec(inds_A);
k2vec = k2vec(inds_A);
k1vec = k1vec(inds_B);

then constructing the product vector after preallocation.

Out of memory error when multiplying two very large sparse matrices in MATLAB

Answers (2)

Related Questions