Tony Tannous

Reputation: 14876

Python create an empty sparse matrix

I am trying to parse some real data into a .mat object to be loaded in my script.

I am getting this error:

TypeError: 'coo_matrix' object does not support item assignment

I found coo_matrix. However, I am not able to assign values to it.

data.txt

10 45
11 12 
4 1

I would like to get a sparse matrix of size 100x100 and assign 1s to the following entries:

Mat(10, 45) = 1
Mat(11, 12) = 1
Mat(4, 1) = 1

CODE

import numpy as np
from scipy.sparse import coo_matrix

def pdata(pathToFile):
    M = coo_matrix(100, 100)
    with open(pathToFile) as f:
        for line in f:
            s = line.split()
            x, y = [int(v) for v in s]
            M[x, y] = 1     
    return M

if __name__ == "__main__":
    M = pdata('small.txt')  

Any suggestions, please?

Upvotes: 6

Views: 12781

Answers (2)

hpaulj

Reputation: 231385

Constructing this matrix with coo_matrix, using the (data, (rows, cols)) parameter format:

In [1]: import numpy as np
In [2]: from scipy import sparse
In [3]: from scipy import io
In [4]: data=np.array([[10,45],[11,12],[4,1]])
In [5]: data
Out[5]: 
array([[10, 45],
       [11, 12],
       [ 4,  1]])
In [6]: rows = data[:,0]
In [7]: cols = data[:,1]
In [8]: data = np.ones(rows.shape, dtype=int)
In [9]: M = sparse.coo_matrix((data, (rows, cols)), shape=(100,100))
In [10]: M
Out[10]: 
<100x100 sparse matrix of type '<class 'numpy.int32'>'
    with 3 stored elements in COOrdinate format>
In [11]: print(M)
  (10, 45)  1
  (11, 12)  1
  (4, 1)    1

If you save it to a .mat file for use in MATLAB, it will save it in csc format (having converted it from the coo):

In [13]: io.savemat('test.mat',{'M':M})
In [14]: d = io.loadmat('test.mat')
In [15]: d
Out[15]: 
{'M': <100x100 sparse matrix of type '<class 'numpy.int32'>'
    with 3 stored elements in Compressed Sparse Column format>,
 '__globals__': [],
 '__header__': b'MATLAB 5.0 MAT-file Platform: posix, Created on: Mon Aug  7 08:45:12 2017',
 '__version__': '1.0'}

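The coo format does not implement item assignment. csr and csc do implement it, but they issue a SparseEfficiencyWarning when an assignment changes the sparsity structure, because that operation is expensive in those formats. They are, however, the normal formats for calculation. lil and dok are the best formats for iterative assignment.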
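Applied to the file in the question, the same (data, (rows, cols)) construction could look roughly like the sketch below. The helper name pairs_to_coo and the in-memory sample standing in for data.txt are assumptions made for illustration; they are not part of the original code.

```python
import io

import numpy as np
from scipy import sparse


def pairs_to_coo(lines, shape=(100, 100)):
    """Build a COO matrix with a 1 at every "row col" pair read from `lines`.

    `pairs_to_coo` is a hypothetical helper, not a SciPy function.
    """
    rows, cols = [], []
    for line in lines:
        parts = line.split()
        if not parts:  # skip blank lines
            continue
        r, c = (int(v) for v in parts)
        rows.append(r)
        cols.append(c)
    data = np.ones(len(rows), dtype=int)
    # All entries are handed over in one call, so no item assignment is needed.
    return sparse.coo_matrix((data, (rows, cols)), shape=shape)


# In-memory stand-in for data.txt; an open file object works the same way.
sample = io.StringIO("10 45\n11 12 \n4 1\n")
M = pairs_to_coo(sample)

# Convert to csr for indexing and arithmetic.
M_csr = M.tocsr()
```

One caveat: if the same pair appears twice in the file, coo keeps both entries and sums them on conversion to csr or csc, so that entry would become 2 rather than 1.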

Upvotes: 3

sascha

Reputation: 33532

Use a sparse format that supports efficient indexing, such as dok_matrix:

This is an efficient structure for constructing sparse matrices incrementally.

...

Allows for efficient O(1) access of individual elements. Duplicates are not allowed. Can be efficiently converted to a coo_matrix once constructed.

The last sentence can be generalized to: can be efficiently converted to all the other common formats if needed.

from scipy.sparse import dok_matrix

M = dok_matrix((100, 100))  # extra brackets needed as mentioned in comments
                            # thanks Daniel!
M[0,3] = 5
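For the data in the question, a minimal sketch of the incremental approach might look like this. The pairs list is hard-coded here as an assumption, in place of reading the file, and dtype=int is chosen only to store integer ones.

```python
from scipy.sparse import dok_matrix

# The three (row, col) pairs from the question's data.txt, hard-coded for illustration.
pairs = [(10, 45), (11, 12), (4, 1)]

M = dok_matrix((100, 100), dtype=int)
for r, c in pairs:
    M[r, c] = 1  # item assignment is supported by dok_matrix

# Convert once the matrix is filled: coo for export, csr for computation.
M_coo = M.tocoo()
M_csr = M.tocsr()
```

Unlike the coo construction, assigning the same position twice simply overwrites the stored value, so repeated pairs in the input still yield 1.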

Upvotes: 11
