fsociety

Reputation: 1850

Memory efficient storage of many large scipy sparse matrices

I need to store around 50,000 scipy sparse CSR matrices, where each matrix is a column vector of length 3.7 million:

x = scipy.sparse.csr_matrix((3700000, 1))

I currently store them in a plain dictionary, because I also need to know the corresponding key for each vector (in this case the key is just a simple integer).

The problem is the huge amount of memory this requires. Is there a more efficient way?
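For reference, a minimal sketch of the setup described above: a dictionary mapping integer keys to sparse CSR column vectors (the loop count is reduced here for illustration; the real use case has ~50,000 entries):

```python
import scipy.sparse

n_rows = 3_700_000          # vector length from the question
vectors = {}                # integer key -> csr_matrix, as described
for key in range(3):        # reduced from ~50,000 for illustration
    vectors[key] = scipy.sparse.csr_matrix((n_rows, 1))
```

An empty CSR matrix itself is cheap; the memory cost grows with the number of stored nonzeros and the per-object overhead across tens of thousands of matrix objects.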

Upvotes: 3

Views: 1319

Answers (1)

ymn

Reputation: 2153

Try using lazy data structures.

For example:

def lazy(func):
    def lazyfunc(*args, **kwargs):
        # Capture the arguments and defer the real call until the
        # returned thunk is invoked.
        temp = lambda: func(*args, **kwargs)
        temp.__name__ = "lazy-" + func.__name__
        return temp
    return lazyfunc

"""
Add some simple functions
"""
def add(x, y):
    print("Not lazy")
    return x + y

@lazy
def add_lazy(x, y):
    print("lazy!")
    return x + y

Usage:

>>> add(1, 2)
Not lazy
3
>>> add_lazy(1, 2)
<function lazy-add_lazy at 0x021E9470>
>>> myval = add_lazy(1, 2)
>>> myval()
lazy!
3


Upvotes: 3
