Reputation: 31
I have a very large sparse matrix in R. For specified rows, I want to extract only the nonzero values from the respective columns (typically 5-10 out of 10000). With View, only a very small subset of the matrix can be displayed (it exceeds memory, I guess). I hit the same problem when I use, e.g., A[1, ] to extract the first row of A. Whenever I specify a row of the matrix, I would like to get a vector containing only the column indices and corresponding values where the value is above zero. Is there a smart way of doing this?
Upvotes: 3
Views: 2922
Reputation: 454
Assuming you have a sparse dgCMatrix and the selected row is stored in the variable 'rowIndx', the following code builds an index of all non-zero elements and then picks out the entries that belong to the row of interest.
rowIndx <- 2
mm <- Matrix::Matrix(matrix(rbinom(2e4, 1, 0.10), ncol = 100))
Create the indices of non-zero elements
colN <- diff(mm@p) #get the number of non-zero elements in each column
indx <- cbind(mm@i+1,rep(seq_along(colN),colN)) #create the indices of all non-zero elements
Get the required column indices and values
indx[which(indx[,1]==rowIndx),2] #vector of non-zero column indices
mm[rowIndx,indx[which(indx[,1]==rowIndx),2]] #vector of non-zero values
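The steps above can be wrapped into a small helper. This is only a sketch: the function name nonzero_in_row is hypothetical (it is not part of the Matrix package), and it assumes the input is a dgCMatrix. It reads the values straight from the @x slot instead of subsetting the matrix again, and returns the column indices together with the values.

```r
library(Matrix)

# Hypothetical helper (not part of the Matrix package): returns the column
# indices and values of the non-zero entries in one row of a dgCMatrix.
nonzero_in_row <- function(m, row) {
  stopifnot(is(m, "dgCMatrix"))
  col_counts <- diff(m@p)                          # stored entries per column
  cols <- rep(seq_along(col_counts), col_counts)   # column of every stored entry
  keep <- (m@i + 1L) == row & m@x != 0             # entries in this row, skipping explicitly stored zeros
  data.frame(col = cols[keep], value = m@x[keep])
}

set.seed(1)  # only to make the example reproducible
mm <- Matrix(matrix(rbinom(2e4, 1, 0.10), ncol = 100), sparse = TRUE)
res <- nonzero_in_row(mm, 2)
head(res)
```

If many rows are needed, it is probably cheaper to compute cols once outside the function and reuse it, since that part does not depend on the selected row.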
For a large dgCMatrix with 2e8 elements, this method is about three times faster than creating the indices with 'which':
indx <- which(mm != 0, arr.ind = TRUE)
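To check the timing on your own data, something along these lines could be used. It is a rough sketch, not a careful benchmark: the matrix size, density and seed are arbitrary assumptions (much smaller than 2e8 elements so that it runs quickly), and the speed ratio will vary with size and density.

```r
library(Matrix)

set.seed(42)
# Arbitrary test matrix; rsparsematrix() returns a dgCMatrix by default
big <- rsparsematrix(2000, 1000, density = 0.01)

# Index built directly from the @p and @i slots
t_slots <- system.time({
  colN <- diff(big@p)
  indx_slots <- cbind(big@i + 1L, rep(seq_along(colN), colN))
})

# Index built with which()
t_which <- system.time({
  indx_which <- which(big != 0, arr.ind = TRUE)
})

# Both approaches should list the same (row, column) pairs
same <- identical(dim(indx_slots), dim(indx_which)) &&
  all(indx_slots == unname(indx_which))

print(same)
print(c(slots = t_slots[["elapsed"]], which = t_which[["elapsed"]]))
```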
Upvotes: 2