Reputation: 2669
I have a sparse matrix 'a' of type csr_matrix. I want to create a new csr_matrix 'b' in which every row is a copy of the ith row of 'a'.
For ordinary numpy arrays this can be done with the 'tile' operation, but I cannot find an equivalent for csr_matrix.
Building a dense numpy matrix first and then converting it to csr_matrix is not an option, because the matrix is 10000 x 10000.
Upvotes: 1
Views: 1427
Reputation: 1
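One can do:
import scipy.sparse
row = a.getrow(row_idx)
n_rows = a.shape[0]
# A plain list of references avoids relying on how np.repeat treats sparse objects
b = tiled_row = scipy.sparse.vstack([row] * n_rows, format='csr')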
Upvotes: 0
Reputation: 2669
I eventually found an answer that does not require creating the full numpy matrix and is fast enough for my purpose. I am adding it as an answer in case it is useful to others in the future:
import numpy as np
import scipy.sparse
rows, cols = a.shape
row = a[2]  # 1 x cols CSR matrix holding the row to tile
b = scipy.sparse.csr_matrix((np.tile(row.data, rows), np.tile(row.indices, rows),
                             np.arange(rows + 1) * row.nnz), shape=a.shape)
This takes the row at index 2 of 'a' (the third row, since indexing is zero-based) and tiles it to create 'b'.
The timing test below suggests it is quite fast for a 10000x10000 matrix:
100 loops, best of 3: 2.24 ms per loop
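For anyone who wants to check the construction on a small case, here is a minimal sketch that wraps the same idea in a helper and compares it with a dense np.tile result. The function name tile_row_csr and the 3 x 4 test matrix are made up for illustration and are not part of the code above.

```python
import numpy as np
import scipy.sparse as sp


def tile_row_csr(a, row_idx):
    """Return a CSR matrix shaped like `a` whose rows all equal a[row_idx].

    Hypothetical helper for illustration only.
    """
    n_rows, n_cols = a.shape
    row = a[row_idx]                          # 1 x n_cols CSR matrix
    nnz = row.nnz                             # stored entries in that row
    data = np.tile(row.data, n_rows)          # values repeat once per output row
    indices = np.tile(row.indices, n_rows)    # column indices repeat as well
    indptr = np.arange(n_rows + 1) * nnz      # row i occupies [i*nnz, (i+1)*nnz)
    return sp.csr_matrix((data, indices, indptr), shape=(n_rows, n_cols))


# Small made-up test matrix
a = sp.csr_matrix(np.array([[1, 0, 2, 0],
                            [0, 0, 0, 3],
                            [0, 4, 0, 5]]))
b = tile_row_csr(a, 2)
expected = np.tile(a[2].toarray(), (a.shape[0], 1))
print(np.array_equal(b.toarray(), expected))
```

Building indptr by multiplication instead of a stepped arange also covers the case where the selected row has no stored entries, since a step of zero is not allowed in np.arange.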
Upvotes: 2
Reputation: 231395
There is a bmat function (scipy.sparse.bmat) that lets you create a new sparse matrix from a grid of other matrices.
So for a start you could try
a1 = a[I,:]  # I is the index of the row to tile
ll = [[a1],[a1],[a1],[a1]]  # one block per output row
sparse.bmat(ll)
I don't have a shell running to test this.
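A sketch of the block-stacking idea that can be pasted into a shell, assuming scipy.sparse.bmat is the function meant here; scipy.sparse.vstack is shown next to it because it accepts a flat list of blocks. The test matrix and the row index i are made up for illustration.

```python
import numpy as np
import scipy.sparse as sp

# Small made-up test matrix
a = sp.csr_matrix(np.array([[1, 0, 2, 0],
                            [0, 0, 0, 3],
                            [0, 4, 0, 5]]))
i = 1                         # hypothetical index of the row to tile
a1 = a[i, :]                  # 1 x n sparse row
n_rows = a.shape[0]

# bmat expects a 2-D grid of blocks, so each row block gets its own inner list
b_bmat = sp.bmat([[a1] for _ in range(n_rows)], format="csr")

# vstack takes a flat list and stacks the blocks vertically
b_vstack = sp.vstack([a1] * n_rows, format="csr")

print(b_bmat.toarray())
```

Both calls reuse references to the same a1 object, so no dense copy of the row is made before stacking.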
Internally it turns all input arrays into coo format, and collects their coo attributes into 3 large lists (or arrays). In your case of tiled rows, the data and col (j) values would just repeat. The row (I) values would step.
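The coo pattern described above can be written out by hand. This is only a sketch under my own naming (tile_row_coo and the test matrix are hypothetical), not a description of what scipy does internally line by line.

```python
import numpy as np
import scipy.sparse as sp


def tile_row_coo(a, row_idx, n_rows):
    """Tile a[row_idx] into n_rows rows by building coo arrays directly.

    Hypothetical helper for illustration only.
    """
    row = a[row_idx, :].tocoo()
    nnz = row.data.size
    data = np.tile(row.data, n_rows)              # data values just repeat
    cols = np.tile(row.col, n_rows)               # column (j) values just repeat
    rows = np.repeat(np.arange(n_rows), nnz)      # row values step: 0,0,..,1,1,..
    tiled = sp.coo_matrix((data, (rows, cols)), shape=(n_rows, a.shape[1]))
    return tiled.tocsr()


# Small made-up test matrix
a = sp.csr_matrix(np.array([[1, 0, 2, 0],
                            [0, 0, 0, 3],
                            [0, 4, 0, 5]]))
b = tile_row_coo(a, 0, 5)
print(b.toarray())
```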
Another way to approach it would be to construct a small test matrix, and look at the attributes. What kinds of repetition do you see? It's easy to see patterns in the coo format. lil might also be easy to replicate, maybe with the list *n operation. csr is trickier to understand.
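One way to run that experiment is sketched below: tile a tiny made-up row with scipy.sparse.vstack and print the attributes of the result in each format. The values 7 and 9 and the row count are arbitrary choices for illustration.

```python
import numpy as np
import scipy.sparse as sp

# A single made-up sparse row with two stored values
row = sp.csr_matrix(np.array([[0, 7, 0, 9]]))
tiled = sp.vstack([row] * 3, format="csr")

# coo: three parallel arrays, one entry per stored value
coo = tiled.tocoo()
print("coo row :", coo.row)
print("coo col :", coo.col)
print("coo data:", coo.data)

# csr: data and indices repeat, indptr marks where each row starts
print("csr indices:", tiled.indices)
print("csr indptr :", tiled.indptr)

# lil: one Python list of column indices and one of values per row
lil = tiled.tolil()
print("lil rows:", lil.rows)
print("lil data:", lil.data)
```

In the csr output the only attribute that does not simply repeat is indptr, which grows by the number of stored values per row.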
Upvotes: 0