SciPy sparse matrix (COO,CSR): Clear row

Question

For creating a scipy sparse matrix, I have an array or row and column indices I and J along with a data array V. I use those to construct a matrix in COO format and then convert it to CSR,

matrix = sparse.coo_matrix((V, (I, J)), shape=(n, n))
matrix = matrix.tocsr()

I have a set of row indices for which the only entry should be a 1.0 on the diagonal. So far, I go through I, find all indices that need wiping, and do just that:

def find(lst, a):
    # From 
    return [i for i, x in enumerate(lst) if x in a]

# wipe_rows = [1, 55, 32, ...]  # something something

indices = find(I, wipe_rows)  # takes too long
I = numpy.delete(I, indices).tolist()
J = numpy.delete(J, indices).tolist()
V = numpy.delete(V, indices).tolist()

# Add entry 1.0 to the diagonal for each wipe row
I.extend(wipe_rows)
J.extend(wipe_rows)
V.extend(numpy.ones(len(wipe_rows)))

# construct matrix via coo

That works alright, but find tends to take a while.

Any hints on how to speed this up? (Perhaps wiping the rows in COO or CSR format is a better idea.)

Nico Schl&#246;mer · Accepted Answer

If you intend to clear multiple rows at once, this

def _wipe_rows_csr(matrix, rows):
    assert isinstance(matrix, sparse.csr_matrix)

    # delete rows
    for i in rows:
        matrix.data[matrix.indptr[i]:matrix.indptr[i+1]] = 0.0

    # Set the diagonal
    d = matrix.diagonal()
    d[rows] = 1.0
    matrix.setdiag(d)

    return

is by far the fastest method. It doesn't really remove the lines, but sets all entries to zeros, then fiddles with the diagonal.

If the entries are actually to be removed, one has to do some array manipulation. This can be quite costly, but if speed is no issue: This

def _wipe_row_csr(A, i):
    '''Wipes a row of a matrix in CSR format and puts 1.0 on the diagonal.
    '''
    assert isinstance(A, sparse.csr_matrix)

    n = A.indptr[i+1] - A.indptr[i]

    assert n > 0

    A.data[A.indptr[i]+1:-n+1] = A.data[A.indptr[i+1]:]
    A.data[A.indptr[i]] = 1.0
    A.data = A.data[:-n+1]

    A.indices[A.indptr[i]+1:-n+1] = A.indices[A.indptr[i+1]:]
    A.indices[A.indptr[i]] = i
    A.indices = A.indices[:-n+1]

    A.indptr[i+1:] -= n-1

    return

replaces a given row i of the matrix matrix by the entry 1.0 on the diagonal.

SciPy sparse matrix (COO,CSR): Clear row

Answers (2)

Related Questions