Reputation: 412
If I save a CSR matrix using numpy.save() and then load it back via numpy.load(), a huge number of properties disappear: in particular the shape comes back empty, and it's not possible to access values by index. Is this normal?
In the example below I create a CSR matrix from three arrays: the data, the indices and the index pointers. I then save it, load it back, and demonstrate the failure of the shape and index operations on the saved version.
> import numpy as np
> import scipy as sp
> import scipy.sparse as ssp
> wd
Out[1]:
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int16)
> wi
Out[1]:
array([200003, 1, 200009, 300000, 200002, 200006, 200007, 250000,
300500, 200010, 300501, 200001, 200000, 0, 200008, 200004,
200005, 200011, 200018, 2, 200019, 200013, 300001, 200014,
200015, 200022, 200012, 200020, 200021, 200016, 200017, 200023,
200027, 2, 200030, 200032, 200028, 200033, 200031, 200029,
200026, 200025, 200024, 200047, 2, 200042, 200045, 200046,
200028, 200038, 200040, 200039, 200036, 200037, 200012, 200048,
200041, 200035, 200044, 200043, 200034, 200049, 3, 200050,
4], dtype=int32)
> wp
Out[1]: array([ 0, 18, 31, 43, 61, 65], dtype=int32)
> ww = ssp.csr_matrix((wd,wi,wp))
> ww.shape
Out[1]: (5, 300502)
> ww[2,3]
Out[1]: 0
> ww[0,0]
Out[1]: 1
> np.save('/Users/bryanfeeney/Desktop/ww.npy', ww)
> www = np.load('/Users/bryanfeeney/Desktop/ww.npy')
> www.shape
Out[1]: ()
> www[2,3]
Traceback (most recent call last):
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/IPython/core/interactiveshell.py", line 2732, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-1-35f1349fb755>", line 1, in <module>
www[2,3]
IndexError: 0-d arrays can only use a single () or a list of newaxes (and a single ...) as an index
> www[0,0]
Traceback (most recent call last):
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/IPython/core/interactiveshell.py", line 2732, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-1-43c5da404060>", line 1, in <module>
www[0,0]
IndexError: 0-d arrays can only use a single () or a list of newaxes (and a single ...) as an index
Here's the version information for the python runtime, numpy and scipy respectively.
> sys.version
Out[1]: '3.3.2 (default, May 21 2013, 11:50:47) \n[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))]'
> np.__version__
Out[1]: '1.7.1'
> sp.__version__
Out[1]: '0.12.0'
Upvotes: 2
Views: 596
Reputation: 58915
This seems to be a bug, but you can pickle the whole sparse-matrix object. Open the file in binary mode, which pickle needs on Python 3:
import pickle
with open('ww.pkl', 'wb') as f:
    pickle.dump(ww, f)
and when you want to load:
with open('ww.pkl', 'rb') as f:
    ww = pickle.load(f)
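Whether or not it counts as a bug, the empty shape in the question suggests that numpy.save() wrapped the matrix in a 0-d object array and pickled it. If so, the matrix is still in the file and .item() should hand it back. A minimal sketch, assuming a recent NumPy where numpy.load() needs allow_pickle=True for object arrays (that argument does not exist in 1.7.1); the small matrix and the in-memory buffer are stand-ins for the real data and file:

```python
import io

import numpy as np
import scipy.sparse as ssp

# Small stand-in for the matrix in the question.
ww = ssp.csr_matrix(np.array([[1, 0, 2], [0, 0, 3]], dtype=np.int16))

buf = io.BytesIO()   # in-memory stand-in for 'ww.npy'
np.save(buf, ww)     # the matrix ends up inside a 0-d object array
buf.seek(0)

www = np.load(buf, allow_pickle=True)
print(www.shape, www.dtype)   # 0-d, object dtype

recovered = www.item()        # unwrap the original sparse matrix
print(recovered.shape, recovered[1, 2])
```

Since this relies on pickle underneath, it has the same caveats as pickling directly: only load files you trust.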
Upvotes: 0
Reputation: 19252
The three variables wd, wi and wp make up your sparse matrix. You need to save all three of these, since numpy.save deals with NumPy arrays.
Then, having loaded them (say as wwd, wwi and wwp), make a new matrix, where (M, N) is the shape of the original matrix:
new_csr = ssp.csr_matrix((wwd, wwi, wwp), shape=(M, N))
See here for a similar discussion.
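The save-three-arrays approach can be sketched roughly as follows. The helper names save_csr and load_csr are made up for illustration, and storing the shape alongside the arrays in a single .npz file via numpy.savez is one possible choice, not something the answer prescribes:

```python
import os
import tempfile

import numpy as np
import scipy.sparse as ssp


def save_csr(path, m):
    """Store the three CSR arrays plus the shape in one .npz file (hypothetical helper)."""
    np.savez(path, data=m.data, indices=m.indices, indptr=m.indptr,
             shape=np.array(m.shape))


def load_csr(path):
    """Rebuild the CSR matrix from the arrays written by save_csr (hypothetical helper)."""
    with np.load(path) as f:
        shape = tuple(int(n) for n in f["shape"])
        return ssp.csr_matrix((f["data"], f["indices"], f["indptr"]), shape=shape)


# Small stand-in for the matrix in the question.
ww = ssp.csr_matrix(np.array([[1, 0, 0, 2], [0, 0, 3, 0]], dtype=np.int16))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ww.npz")
    save_csr(path, ww)
    new_csr = load_csr(path)

print(new_csr.shape, (new_csr != ww).nnz)
```

Passing the shape explicitly matters when trailing columns are empty, because the constructor otherwise infers the width from the largest column index. Newer SciPy releases (0.19 and later) also provide scipy.sparse.save_npz and scipy.sparse.load_npz, which do much the same thing.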
Upvotes: 0