anthonybell

Reputation: 5998

loading a sparse matrix saved with np.save

I saved a scipy CSR matrix using np.save('X', X). When I load it with np.load('X.npy'), I get this output:

array(<240760x110493 sparse matrix of type '<class 'numpy.float64'>' with 20618831 stored elements in Compressed Sparse Row format>, dtype=object)

However, I cannot access this data by indexing: X[0,0], X[:10,:10] and X[0] all raise IndexError: too many indices for array, and .shape returns ().

Is there a way to access this data, or is it corrupt now?

Edit.

Since there are three options for saving and loading a matrix, I ran a speed comparison to see which works best for my sparse matrix:

Writing a sparse matrix:

%timeit -n1 scipy.io.savemat('tt', {'t': X})
1 loops, best of 3: 66.3 ms per loop

%timeit -n1 scipy.io.mmwrite('tt_mm', X)
1 loops, best of 3: 7.55 s per loop

%timeit -n1 np.save('tt_np', X)
1 loops, best of 3: 188 ms per loop

Reading a sparse matrix:

%timeit -n1 scipy.io.loadmat('tt')
1 loops, best of 3: 9.78 ms per loop

%timeit -n1 scipy.io.mmread('tt_mm')
1 loops, best of 3: 5.72 s per loop

%timeit -n1 np.load('tt_np.npy')
1 loops, best of 3: 150 ms per loop

The results: mmread/mmwrite are incredibly slow (tens to hundreds of times slower than the other two), and savemat/loadmat is roughly 3 to 15 times faster than np.save/np.load.
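For anyone who wants to repeat the comparison outside IPython, here is a rough sketch using only the standard library timer. The small random matrix, the temporary file names and the timed helper are stand-ins of my own, not the data or code from the original run, so the absolute numbers will differ. Note that recent NumPy versions need allow_pickle=True to load the object array that np.save writes for a sparse matrix.

```python
import os
import tempfile
import time

import numpy as np
import scipy.io
import scipy.sparse as sp

# Assumption: a small random CSR matrix stands in for the real 240760x110493 one.
X = sp.random(2000, 1000, density=0.01, format="csr", random_state=0)


def timed(fn):
    """Hypothetical helper: run fn once, return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


timings = {}
loaded = {}

with tempfile.TemporaryDirectory() as tmp:
    mat = os.path.join(tmp, "tt.mat")
    mtx = os.path.join(tmp, "tt_mm.mtx")
    npy = os.path.join(tmp, "tt_np.npy")

    writers = {
        "savemat": lambda: scipy.io.savemat(mat, {"t": X}),
        "mmwrite": lambda: scipy.io.mmwrite(mtx, X),
        "np.save": lambda: np.save(npy, X),
    }
    readers = {
        "loadmat": lambda: scipy.io.loadmat(mat)["t"],
        "mmread": lambda: scipy.io.mmread(mtx),
        # np.save stored a pickled 0-d object array, so unwrap it with .item()
        "np.load": lambda: np.load(npy, allow_pickle=True).item(),
    }

    for name, fn in writers.items():
        _, timings[name] = timed(fn)
    for name, fn in readers.items():
        loaded[name], timings[name] = timed(fn)

for name, seconds in timings.items():
    print(f"{name:8s} {seconds * 1e3:8.2f} ms")
```

A single run per call is noisy; for numbers worth comparing, repeat each call several times and take the best, as %timeit does.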

Upvotes: 3

Views: 1622

Answers (1)

hpaulj

Reputation: 231385

Let's pay attention to all the clues in the printed output:

array(<240760x110493 sparse matrix of type '<class 'numpy.float64'>'
     with 20618831 stored elements in Compressed Sparse Row format>, dtype=object)

Outermost:

array(....,dtype=object)

A sparse matrix is not a regular array; to np.save, it is just a Python object. So np.save wrapped it in a dtype=object array and saved that. It is a 0d array (hence the () shape), so all the indexing attempts fail. Try instead:

M=arr.item() # or
M=arr[()]

Now M should display as:

sparse matrix of type '<class 'numpy.float64'>'
     with 20618831 stored elements in Compressed Sparse Row format

with attributes like M.shape. M.A would display the dense form, but the matrix is too large to do that usefully.
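Here is a small round trip that shows the wrapping and unwrapping, as a sketch rather than a recipe. The 50x40 random matrix and the temporary path are stand-ins of my own for the matrix in the question. One thing that has changed since the question was asked: recent NumPy versions refuse to load object arrays unless you pass allow_pickle=True.

```python
import os
import tempfile

import numpy as np
import scipy.sparse as sp

# Assumption: a small random CSR matrix stands in for the one in the question.
X = sp.random(50, 40, density=0.1, format="csr", random_state=0)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "X.npy")
    np.save(path, X)                        # pickles X inside a 0-d object array
    arr = np.load(path, allow_pickle=True)  # object arrays need allow_pickle=True

print(arr.shape, arr.dtype)  # () object
M = arr.item()               # arr[()] returns the same object
print(M.shape)               # (50, 40)
print(np.array_equal(M.toarray(), X.toarray()))  # True
```

So the file is not corrupt; the sparse matrix is intact inside the wrapper array.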

Upvotes: 5
