glep
glep

Reputation: 92

Matlab allocates a sparse matrix more memory than is required

Suppose I create this sparse matrix, where the non-zero elements consist of booleans 'true':

s = sparse([3 2 3 3 3 3 2 34 3 6 3 2 3 3 3 3 2 3 3 6], [10235 11470 21211 33322 49297 88361 91470 127422 152383 158751 166485 171471 181211 193321 205548 244609 251470 283673 312384 318752], true);

which contains 20 elements. Matlab ought to allocates no more than (4+4+1)*20 = 180 bytes of memory (it looks like the indices are 4 bytes long). Yet

whos s

says that the matrix takes up 1275112 bytes in memory, which is a problem as I need to store many thousands of these.

Any idea why this happens?

Cheers!

Upvotes: 1

Views: 489

Answers (2)

Strategy Thinker
Strategy Thinker

Reputation: 363

I do not know why this happens but I do know how to fix it. Here is some code revealing that a sparse matrix is linearly dependent on the number of columns.

for k = 1:6
  n = 10^k; 
  a = sparse(n, 100); % keep number of columns constant
  tmp = whos('a'); 
  fprintf('%1.0f bytes used\n', tmp.bytes); 
end

which produces

416 bytes used
416 bytes used
416 bytes used
416 bytes used
416 bytes used
416 bytes used

while keeping the number of rows constant with a = sparse(n, 100); instead gives

     56 bytes used
    416 bytes used
   4016 bytes used
  40016 bytes used
 400016 bytes used
4000016 bytes used

thus to optimize your code change s from a 34 x 318752 to a 318752 x 34 matrix by swapping the first two inputs of sparse.

Upvotes: 1

Peter
Peter

Reputation: 14937

The memory storage format of a sparse matrix in MATLAB is a dense array of column pointers. Each column pointer points to a list of nonzero elements, and each element needs an index and a value. So the formula is

(max column num) x P + (num nonzero) x (P + S)

where P is pointer size (8 bytes on a 64-bit system, 4 on a 32-bit system), and S is the size of a single element. 1 for logical. For your problem I get 1275108, or "close enough".

So what to do about it? Note the big memory driver: maximum column number, due to the dense array of column pointers. In your case, if you reverse the index order and store the transpose of the matrix, it takes only 236 bytes (on your 32-bit system).

Upvotes: 5

Related Questions