I have ported a piece of MATLAB code to Python and am running into efficiency problems.
For instance, here is a snippet:
G = np.vstack((Gx.toarray(), Gy.toarray(), Gd1.toarray(), Gd2.toarray()))
Here all elements to be stacked are 22500 by 22500 sparse matrices. It dies immediately on my 64-bit Windows machine with the following error:
return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
MemoryError
I'm quite new to Python; is there a good article on best practices for this kind of optimization, or any information on how NumPy works with memory?
As far as I know, sparse matrices are stored in some kind of compressed format, so they take much less space but are slower to work with.
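For example, as far as I understand it, a rough check along these lines (just a sketch, using SciPy's CSR format) shows the space difference:

import numpy as np
import scipy.sparse as sp

# A mostly-zero 2000 x 2000 matrix: dense vs. compressed sparse row storage
dense = np.zeros((2000, 2000))
dense[::100, ::100] = 1.0
compressed = sp.csr_matrix(dense)

print(dense.nbytes)  # 32,000,000 bytes (~32 MB) as a dense float64 array
print(compressed.data.nbytes + compressed.indices.nbytes + compressed.indptr.nbytes)  # on the order of kilobytes: only the nonzeros plus index arrays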
Thx!
For stacking sparse matrices, you can use SciPy's sparse vstack function instead of NumPy's vstack, like so -
import numpy as np
import scipy.sparse as sp

Gout = sp.vstack((Gx,Gy,Gd1,Gd2))
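Note that sp.vstack returns a COO matrix by default; if you need fast row slicing or repeated arithmetic on the result afterwards, you can request CSR directly -

Gout = sp.vstack((Gx, Gy, Gd1, Gd2), format='csr')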
Sample run -
In [364]: # Generate random sparse matrices
...: Gx = sp.coo_matrix(3*(np.random.rand(10,10)>0.7).astype(int))
...: Gy = sp.coo_matrix(4*(np.random.rand(10,10)>0.7).astype(int))
...: Gd1 = sp.coo_matrix(5*(np.random.rand(10,10)>0.7).astype(int))
...: Gd2 = sp.coo_matrix(6*(np.random.rand(10,10)>0.7).astype(int))
...:
In [365]: # Run original and proposed approaches
...: G = np.vstack((Gx.toarray(), Gy.toarray(), Gd1.toarray(), Gd2.toarray()))
...: Gout = sp.vstack((Gx,Gy,Gd1,Gd2))
...:
In [366]: # Finally verify results
...: np.allclose(G,Gout.toarray())
Out[366]: True
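As for why the original line dies: each .toarray() call materializes a full dense matrix, and a 22500 x 22500 float64 array is roughly 4 GB on its own. A back-of-the-envelope estimate (plain arithmetic, no library calls):

# Rough memory estimate for the dense approach (float64 = 8 bytes per element)
n = 22500
per_matrix = n * n * 8           # ~4.05 GB for one dense 22500 x 22500 array
stacked = 4 * n * n * 8          # the stacked 90000 x 22500 result is four more of those
print((4 * per_matrix + stacked) / 1e9)  # ~32.4 GB needed at once

That is why the dense approach fails even on a 64-bit machine, while the sparse stack never builds those dense copies.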