JnBrymn

Reputation: 25373

How to deal with giant sparse matrices?

Can someone point me in the right direction? I'm looking to do some heavy-duty manipulation of some really large, often very sparse matrices, and I'm looking for the right tool for the job. These matrices will be much, much larger than the RAM of any single machine and will therefore likely be spread across several machines. I will want to perform all of the common matrix operations: multiplication, transpose, inverse, pseudo-inverse, SVD, eigenvalue decomposition, etc. Key among my concerns is that since the matrices will very likely be spread among several machines, I will want to minimize data movement, because network latency is probably my biggest enemy. I'm concerned that map-reduce (a la Hadoop) is not the right option because its focus is on streaming large amounts of data between machines. This book provides a great intro to map-reduce from an algorithmic perspective. And many matrix operations are akin to giant JOIN operations, which are known to be slow on map-reduce; the sketch below shows what I mean.
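To make the JOIN analogy concrete, here is a minimal single-machine sketch (my own illustration, not from the book; the class and record names are made up). Sparse matrix multiplication C = A * B is structurally a hash join of A's and B's nonzero triples on the shared inner index k; in a map-reduce setting that join becomes a shuffle keyed on k, which is exactly the network-heavy step I'm worried about.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Sketch: sparse matmul as a join.
 * C[i][j] = sum_k A[i][k] * B[k][j], so every A triple (i, k, v) must
 * meet every B triple (k, j, w) sharing its k -- a hash join on k.
 */
public class SparseMatmulAsJoin {

    /** One nonzero entry of a sparse matrix, stored as a triple. */
    record Entry(long row, long col, double value) {}

    static List<Entry> multiply(List<Entry> a, List<Entry> b) {
        // Build side of the join: index B's triples by their row index k.
        Map<Long, List<Entry>> bByRow = new HashMap<>();
        for (Entry e : b) {
            bByRow.computeIfAbsent(e.row(), r -> new ArrayList<>()).add(e);
        }

        // Probe side: each A triple (i, k, v) joins with B triples (k, j, w),
        // accumulating partial products v * w into cell (i, j).
        Map<String, Double> acc = new HashMap<>();
        for (Entry e : a) {
            for (Entry f : bByRow.getOrDefault(e.col(), List.of())) {
                acc.merge(e.row() + "," + f.col(), e.value() * f.value(), Double::sum);
            }
        }

        // Collect the accumulated sums back into result triples.
        List<Entry> c = new ArrayList<>();
        acc.forEach((key, v) -> {
            String[] ij = key.split(",");
            c.add(new Entry(Long.parseLong(ij[0]), Long.parseLong(ij[1]), v));
        });
        return c;
    }
}
```

On a cluster, the `bByRow` lookup table can't fit in one machine's memory, so the "join on k" turns into a shuffle that moves every partial product across the network, which is why I suspect map-reduce is a poor fit here.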

So... where should I go?

Upvotes: 1

Views: 891

Answers (1)

Aravind Yarram

Reputation: 80194

This paper, Design of Hadoop-based Large-Scale Matrix Computations, can help you with implementation guidelines. HBase is designed for storing sparse tables, so it might be a good storage option for the matrices.
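For illustration, here is a rough sketch of one possible layout (my own, not a schema from the paper): one HBase row per matrix row, one column qualifier per nonzero column index. Absent cells cost nothing to store, which is what makes HBase a fit for sparse data. The table name "matrix" and column family "v" are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class SparseMatrixHBase {
    private static final byte[] FAMILY = Bytes.toBytes("v");

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("matrix"))) {

            // Store one nonzero entry: A[42][1000007] = 3.14.
            Put put = new Put(Bytes.toBytes(42L));           // row key = row index
            put.addColumn(FAMILY, Bytes.toBytes(1000007L),   // qualifier = column index
                          Bytes.toBytes(3.14d));
            table.put(put);

            // Read it back; a missing cell is simply absent (an implicit zero).
            Result result = table.get(new Get(Bytes.toBytes(42L)));
            byte[] cell = result.getValue(FAMILY, Bytes.toBytes(1000007L));
            double value = (cell == null) ? 0.0 : Bytes.toDouble(cell);
            System.out.println("A[42][1000007] = " + value);
        }
    }
}
```

With this layout a full matrix row comes back in one Get, and row-oriented operations (like the A side of a multiplication) scan contiguous regions, which keeps reads local to the region servers holding them.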

Upvotes: 1
