Feenaboccles

Reputation: 412

Why does np.load not return a usable csr_matrix saved by np.save?

If I save a CSR matrix using numpy.save() and then load it back with numpy.load(), many of its properties disappear: in particular, the shape comes back as an empty tuple and it is no longer possible to access values by index. Is this normal?

In the example below I create a CSR matrix from three arrays: the data, the indices and the index pointers. I then save it, load it back, and demonstrate the failure of the shape and index operations on the loaded copy.

> import numpy as np
> import scipy as sp
> import scipy.sparse as ssp

> wd
Out[1]: 
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int16)

> wi
Out[1]: 
array([200003,      1, 200009, 300000, 200002, 200006, 200007, 250000,
       300500, 200010, 300501, 200001, 200000,      0, 200008, 200004,
       200005, 200011, 200018,      2, 200019, 200013, 300001, 200014,
       200015, 200022, 200012, 200020, 200021, 200016, 200017, 200023,
       200027,      2, 200030, 200032, 200028, 200033, 200031, 200029,
       200026, 200025, 200024, 200047,      2, 200042, 200045, 200046,
       200028, 200038, 200040, 200039, 200036, 200037, 200012, 200048,
       200041, 200035, 200044, 200043, 200034, 200049,      3, 200050,
            4], dtype=int32)

> wp
Out[1]: array([ 0, 18, 31, 43, 61, 65], dtype=int32)

> ww = ssp.csr_matrix((wd,wi,wp))

> ww.shape
Out[1]: (5, 300502)

> ww[2,3]
Out[1]: 0

> ww[0,0]
Out[1]: 1

> np.save('/Users/bryanfeeney/Desktop/ww.npy', ww)
> www = np.load('/Users/bryanfeeney/Desktop/ww.npy')

> www.shape
Out[1]: ()

> www[2,3]
Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/IPython/core/interactiveshell.py", line 2732, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-1-35f1349fb755>", line 1, in <module>
    www[2,3]
IndexError: 0-d arrays can only use a single () or a list of newaxes (and a single ...) as an index

> www[0,0]
Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/IPython/core/interactiveshell.py", line 2732, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-1-43c5da404060>", line 1, in <module>
    www[0,0]
IndexError: 0-d arrays can only use a single () or a list of newaxes (and a single ...) as an index

Here's the version information for the python runtime, numpy and scipy respectively.

> sys.version
Out[1]: '3.3.2 (default, May 21 2013, 11:50:47) \n[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))]'

> np.__version__
Out[1]: '1.7.1'

> sp.__version__
Out[1]: '0.12.0'

Upvotes: 2

Views: 596

Answers (2)

Saullo G. P. Castro

Reputation: 58915

This seems to be a bug, but you can pickle the whole sparse-matrix object:

import pickle
# Pickle data is binary, so open the file in 'wb' mode (required on Python 3).
with open('ww.pkl', 'wb') as f:
    pickle.dump(ww, f)

and when you want to load:

with open('ww.pkl', 'rb') as f:
    ww = pickle.load(f)
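
As a side note on what np.save actually stored: because a csr_matrix is not an ndarray, NumPy wraps it in a 0-d array of dtype object, which is why the loaded shape is (). The sketch below assumes that behaviour and pulls the original matrix back out with .item(). The small matrix and the temporary path are made up for illustration, and the allow_pickle argument is only available (and required) on NumPy releases newer than the 1.7.1 used in the question.

```python
import os
import tempfile

import numpy as np
import scipy.sparse as ssp

# Small stand-in for the matrix in the question (hypothetical values).
ww = ssp.csr_matrix(([1, 1, 1], [0, 2, 1], [0, 2, 3]), shape=(2, 3))

# Hypothetical location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "ww.npy")

# ww is not an ndarray, so np.save pickles it inside a 0-d object array.
np.save(path, ww)

# Recent NumPy versions refuse to unpickle object arrays unless asked to.
www = np.load(path, allow_pickle=True)
print(www.shape, www.dtype)

# .item() returns the single Python object held by the 0-d array.
restored = www.item()
print(restored.shape)
print(restored[0, 0])
```

This still relies on pickle under the hood, so it carries the same caveats as pickling the matrix directly.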

Upvotes: 0

doctorlove

Reputation: 19252

The three variables wd, wi and wp make up your sparse matrix. You need to save all three of them, because numpy.save only deals with NumPy arrays.
Then, having loaded them back, say as wwd, wwi and wwp, build a new matrix:

# (M, N) is the shape of the original matrix, i.e. ww.shape
new_csr = ssp.csr_matrix((wwd, wwi, wwp), shape=(M, N))

See here for a similar discussion.
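
A minimal sketch of that approach, assuming it is acceptable to bundle the three arrays and the shape into a single .npz archive with np.savez. The helper names save_csr and load_csr, the small test matrix and the temporary path are hypothetical and only serve the illustration.

```python
import os
import tempfile

import numpy as np
import scipy.sparse as ssp


def save_csr(path, matrix):
    """Store the CSR component arrays and the shape in one .npz file."""
    np.savez(path, data=matrix.data, indices=matrix.indices,
             indptr=matrix.indptr, shape=np.array(matrix.shape))


def load_csr(path):
    """Rebuild a csr_matrix from a file written by save_csr."""
    with np.load(path) as archive:
        shape = tuple(int(n) for n in archive["shape"])
        return ssp.csr_matrix(
            (archive["data"], archive["indices"], archive["indptr"]),
            shape=shape,
        )


# Small stand-in for the matrix in the question (hypothetical values).
ww = ssp.csr_matrix(([1, 1, 1], [0, 2, 1], [0, 2, 3]), shape=(2, 3))

path = os.path.join(tempfile.mkdtemp(), "ww.npz")
save_csr(path, ww)
new_csr = load_csr(path)

print(new_csr.shape)
# Number of entries that differ between the original and the reloaded matrix.
print((new_csr != ww).nnz)
```

Storing the shape explicitly matters here: without it, the number of columns is inferred from the largest column index, which can silently shrink a matrix whose last columns are empty. Newer SciPy releases than the 0.12.0 in the question also provide scipy.sparse.save_npz and scipy.sparse.load_npz, which do much the same thing.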

Upvotes: 0
