I have ported a piece of MATLAB code to Python and am running into efficiency problems.
For instance, here is a snippet:
G = np.vstack((Gx.toarray(), Gy.toarray(), Gd1.toarray(), Gd2.toarray()))
Here all elements to be stacked are 22500 by 22500 sparse matrices. It dies immediately on my 64-bit Windows machine with the following error:
return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
MemoryError
I'm quite new to Python; is there a good article on best practices for this kind of optimization, or any information on how NumPy works with memory?
As far as I know, sparse matrices are stored in some kind of compressed format, so they take much less space but are slower to work with.
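For example, as far as I understand it, a rough check along these lines (just a sketch, using SciPy's CSR format) shows the space difference:

import numpy as np
import scipy.sparse as sp

# A mostly-zero 2000 x 2000 matrix: dense vs. compressed sparse row storage
dense = np.zeros((2000, 2000))
dense[::100, ::100] = 1.0
compressed = sp.csr_matrix(dense)

print(dense.nbytes)  # 32,000,000 bytes (~32 MB) as a dense float64 array
print(compressed.data.nbytes + compressed.indices.nbytes + compressed.indptr.nbytes)  # on the order of kilobytes: only the nonzeros plus index arrays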
Thx!
For stacking sparse matrices, you can use SciPy's sparse vstack function instead of NumPy's vstack, like so -
import numpy as np
import scipy.sparse as sp

Gout = sp.vstack((Gx,Gy,Gd1,Gd2))
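Note that sp.vstack returns a COO matrix by default; if you need fast row slicing or repeated arithmetic on the result afterwards, you can request CSR directly -

Gout = sp.vstack((Gx, Gy, Gd1, Gd2), format='csr')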
Sample run -
In [364]: # Generate random sparse matrices
...: Gx = sp.coo_matrix(3*(np.random.rand(10,10)>0.7).astype(int))
...: Gy = sp.coo_matrix(4*(np.random.rand(10,10)>0.7).astype(int))
...: Gd1 = sp.coo_matrix(5*(np.random.rand(10,10)>0.7).astype(int))
...: Gd2 = sp.coo_matrix(6*(np.random.rand(10,10)>0.7).astype(int))
...:
In [365]: # Run original and proposed approaches
...: G = np.vstack((Gx.toarray(), Gy.toarray(), Gd1.toarray(), Gd2.toarray()))
...: Gout = sp.vstack((Gx,Gy,Gd1,Gd2))
...:
In [366]: # Finally verify results
...: np.allclose(G,Gout.toarray())
Out[366]: True
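As for why the original line dies: each .toarray() call materializes a full dense matrix, and a 22500 x 22500 float64 array is roughly 4 GB on its own. A back-of-the-envelope estimate (plain arithmetic, no library calls):

# Rough memory estimate for the dense approach (float64 = 8 bytes per element)
n = 22500
per_matrix = n * n * 8           # ~4.05 GB for one dense 22500 x 22500 array
stacked = 4 * n * n * 8          # the stacked 90000 x 22500 result is four more of those
print((4 * per_matrix + stacked) / 1e9)  # ~32.4 GB needed at once

That is why the dense approach fails even on a 64-bit machine, while the sparse stack never builds those dense copies.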