Avoiding automatic conversion of dgCMatrix to dgeMatrix

Question

I use the dgCMatrix class from the Matrix package to store a square matrix of about 255 million values, which takes up about 1.7 MB.

However, after I run variable <- variable/rowSums(variable), where variable is the sparse matrix, the result is of class dgeMatrix and its size balloons to almost 2 GB, which takes up all available memory and in some instances crashes the script.

Is there a way to force the output to remain a dgCMatrix?

I suspect the reason is that the number of non-zero elements increases to the point that the matrix is no longer considered sparse, because NaN is introduced in every element of a row whose sum is zero. If there is a workaround for the NaNs, I'm open to that too. Note, however, that I cannot avoid producing the zero rows, because my matrix needs to be square, and the corresponding column sums are generally non-zero.

AodhanOL · Accepted Answer

You could try a simple ifelse on the divisor, so that rows with a zero sum are divided by 1 instead of 0:

variable <- variable/ifelse(rowSums(variable)!=0,rowSums(variable),1)

Unless there's some reason you need to divide by 0 there, that seems like the simplest way to avoid NaNs.
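A variant of the same idea, as a minimal sketch rather than a definitive fix: compute the row sums once, replace the zero sums with 1, and scale the rows by multiplying with a sparse diagonal matrix built with Diagonal() from the Matrix package. The small 3 x 3 matrix and the names rs and normalized are made up for illustration; only variable comes from the question.

```r
library(Matrix)

# Small stand-in for the real matrix (illustrative data); row 2 is entirely zero.
variable <- sparseMatrix(i = c(1, 1, 3), j = c(1, 3, 2),
                         x = c(2, 6, 5), dims = c(3, 3))

# Compute the row sums once and replace zero sums with 1,
# so that empty rows stay zero instead of turning into NaN.
rs <- rowSums(variable)
rs[rs == 0] <- 1

# Scale row i by 1 / rs[i] by left-multiplying with a diagonal matrix.
# No 0/0 is ever evaluated, so no NaN entries are created.
normalized <- Diagonal(x = 1 / rs) %*% variable

class(normalized)      # check that the result is still a sparse class
rowSums(normalized)    # non-empty rows sum to 1, the empty row stays 0
```

Because the division is expressed as a matrix product, the result should only have entries where variable already had them, which is what keeps the memory footprint close to the original.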
