SciPy sparse matrix row broadcasting

Question

I've recently been trying to do the following (efficiently):

1. Read a sparse (csr) matrix
2. Select a subset of rows
3. Construct another matrix (all zeros)
4. Fill 3. with the subset obtained in 2.

I can almost achieve this as follows:

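import numpy as np
import scipy.io
import scipy.sparse as sp

# loadmat returns a dict of variables; "X" is a placeholder for the actual variable name
input_matrix = sp.csr_matrix(scipy.io.loadmat(some_matrix)["X"])

# rows are being sampled, so draw from shape[0] (the number of rows)
random_indices = np.random.choice(input_matrix.shape[0], num_samples, replace=False)

second_matrix = sp.dok_matrix(input_matrix.shape)

## this takes up too much memory!
second_matrix[random_indices] = input_matrix[random_indices]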

How can this be done more efficiently? I would like to avoid calling .todense() at any point, since that would also exhaust memory. Intuitively, it should be possible to mask out part of the matrix. With a dense NumPy array I would simply set the remaining rows to zero, but I am not sure whether that is the right approach for csr matrices.

Thanks!
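
One way the masking idea could be sketched without densifying anything is to left-multiply by a sparse diagonal "selector" matrix that has ones only at the chosen row indices. The product keeps the selected rows and leaves every other row empty. This is only a sketch under assumptions: keep_rows is a hypothetical helper name, not a SciPy function, and the input is assumed to be (or be convertible to) a csr matrix.

```python
import numpy as np
import scipy.sparse as sp


def keep_rows(matrix, row_indices):
    """Return a sparse matrix of the same shape as `matrix` in which only
    the rows listed in `row_indices` are kept; all other rows are empty.

    keep_rows is a hypothetical helper name used for illustration.
    """
    matrix = sp.csr_matrix(matrix)
    row_indices = np.asarray(row_indices)
    n_rows = matrix.shape[0]

    # Sparse diagonal selector: 1 at (i, i) for each kept row i, nothing elsewhere.
    ones = np.ones(len(row_indices), dtype=matrix.dtype)
    selector = sp.csr_matrix((ones, (row_indices, row_indices)),
                             shape=(n_rows, n_rows))

    # Sparse-sparse product; no dense intermediate is created.
    return selector @ matrix


# Small example matrix, made up for illustration.
example = sp.csr_matrix(np.array([[1, 0, 2],
                                  [0, 3, 0],
                                  [4, 0, 5],
                                  [0, 0, 6]]))
masked = keep_rows(example, [0, 2])
```

The selector stores one entry per kept row, so the extra memory is proportional to the number of sampled rows plus the nonzeros in those rows, rather than to the full matrix shape.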
