Leo

Reputation: 781

Python scipy.sparse: how to efficiently set a set of entries to 0?

Let a be a big scipy.sparse matrix and IJ={(i0,j0),(i1,j1),...} a set of positions. How can I efficiently set all the entries in a in positions IJ to 0? Something like a[IJ]=0.

In Mathematica, I would create a new sparse matrix b with background value 1 (instead of 0) and all entries in IJ. Then, I would use a=a*b (entry-wise multiplication). That does not seem to be an option here.

A toy example:

import scipy.sparse as sp
import numpy as np
np.set_printoptions(linewidth=200,edgeitems=5,precision=4)
m=n=10**1;
a=sp.random(m,n,4/m,format='csr'); print(a.toarray())
IJ=np.array([range(0,n,2),range(0,n,2)]); print(IJ) #every second diagonal

Upvotes: 0

Views: 202

Answers (2)

Crawl Cycle

Reputation: 287

SciPy sparse matrices can't have a non-zero background value. While it is possible to build a "sparse" matrix with many non-zero values, its performance (speed and memory) would be far worse than simply using a dense matrix.

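A possible workaround is to rewrite every sparse matrix so that its default value is zero. For example, if a matrix Y' contains mostly 1s, I can replace Y' by I - Y, where Y = I - Y' and I is the all-ones matrix (the identity for entry-wise multiplication). Y is then sparse, and entry-wise products with Y' can be expressed through Y.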

import scipy.sparse as sp
import numpy as np

size = (100, 100)
x = np.random.uniform(-1, 1, size=size)
y = sp.random(*size, 0.001, format='csr')

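# Z = (I - Y) * X = X - Y * X, where * is entry-wise multiplication and I is all ones
z = x - y.multiply(x)

# A = X * (I - Y) = X - X * Y; entry-wise multiplication commutes, so A equals Z and no transpose is needed
a = x - y.multiply(x)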
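
Applied to the question's a and IJ, the same idea might look like the minimal sketch below. It assumes the index pairs in IJ are unique and builds a sparse 0/1 mask from them; the names mask and a_masked, the fixed random_state, and the added sp.eye term (used only so the diagonal is populated) are illustrative choices, not part of the original posts.

```python
import numpy as np
import scipy.sparse as sp

n = 10
# Illustrative input: random CSR matrix plus the identity so the diagonal is non-empty
a = sp.random(n, n, 0.4, format='csr', random_state=0) + sp.eye(n, format='csr')
IJ = np.array([range(0, n, 2), range(0, n, 2)])  # every second diagonal position

# Sparse 0/1 mask with ones exactly at the positions to clear
# (assumes the (i, j) pairs are unique; duplicates would be summed)
mask = sp.csr_matrix((np.ones(IJ.shape[1]), (IJ[0], IJ[1])), shape=a.shape)

# a - a*mask (entry-wise) equals a*(1 - mask) without building a dense all-ones matrix
a_masked = (a - a.multiply(mask)).tocsr()
a_masked.eliminate_zeros()  # drop any explicitly stored zeros
```

Everything stays sparse here, so the cost should scale with the number of stored entries rather than with m*n.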

Upvotes: 1

orlp

Reputation: 117681

You are almost there. To go by your definitions, all you'd need to do is:

a[IJ[0],IJ[1]] = 0

Note that scipy will warn you:

SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.

You can read more about that here.
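
A minimal sketch of this approach on a small test matrix follows. It assumes that the assignment may leave explicitly stored zeros behind, so it calls eliminate_zeros() afterwards; the fixed random_state, the added sp.eye term, and the choice to silence the warning are illustrative additions, not part of the answer above.

```python
import warnings

import numpy as np
import scipy.sparse as sp

n = 10
# Illustrative input: random CSR matrix plus the identity so the diagonal is non-empty
a = sp.random(n, n, 0.4, format='csr', random_state=0) + sp.eye(n, format='csr')
IJ = np.array([range(0, n, 2), range(0, n, 2)])  # every second diagonal position
before = a.toarray()  # dense copy kept only for checking the result

# Fancy-index assignment; the efficiency warning is silenced for this block only
with warnings.catch_warnings():
    warnings.simplefilter("ignore", sp.SparseEfficiencyWarning)
    a[IJ[0], IJ[1]] = 0

# Remove entries that are stored but equal to zero, so nnz reflects true non-zeros
a.eliminate_zeros()
```

If many positions that are not yet stored have to be written, converting with a.tolil() first and back with tocsr() afterwards is the route the warning itself suggests.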

Upvotes: 1
