Reputation: 431
I have a huge matrix with values 1, 2 or 3 (and some NA). If the matrix is n x m, I have to recode it to n x 3m, with each value of the original matrix corresponding to 3 entries of the new matrix. If the value in the old matrix is x, then the xth entry will be 1 and the other two will be zeros (if NA, all of them are zero).
1, 3, NA, 1
is recoded to
1 0 0 0 0 1 0 0 0 1 0 0
I.e.
1 = 1 0 0
3 = 0 0 1
NA = 0 0 0
1 = 1 0 0
I have to do this efficiently in R because the matrix is huge. What is the most efficient way to do this? The matrix is in a data.table.
Upvotes: 2
Views: 237
Reputation: 93813
Use matrix indexing to fill a pre-allocated matrix of zeros.
mat <- matrix(c(1,3,NA,1,1,3,NA,1),nrow=2,byrow=TRUE)
mat
# [,1] [,2] [,3] [,4]
#[1,] 1 3 NA 1
#[2,] 1 3 NA 1
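# Pre-allocate the n x 3m result, filled with zeros
newmat <- matrix(0, ncol=ncol(mat)*3, nrow=nrow(mat))
# One (row, column) pair per cell: target column = value + 3*(source column - 1); NA cells give NA
ind <- cbind(rep(1:nrow(mat),ncol(mat)), as.vector(mat + (col(mat)*3-3)))
# Set the indexed cells to 1; the NA index rows leave newmat untouched
newmat[ind] <- 1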
newmat
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
#[1,] 1 0 0 0 0 1 0 0 0 1 0 0
#[2,] 1 0 0 0 0 1 0 0 0 1 0 0
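To make the index arithmetic easier to reuse, the same idea can be wrapped in a small function. This is only a sketch: the name `recode_onehot` and the argument `k` (number of levels) are made up for illustration, and it drops the NA cells explicitly instead of relying on NA indices being skipped. Since the question says the data is in a data.table, I am assuming it would first be converted with `as.matrix()`.

```r
# Hypothetical helper: recode an n x m matrix of 1..k (with NA) into n x (k*m) indicators
recode_onehot <- function(mat, k = 3L) {
  # Pre-allocate the result, all zeros
  out <- matrix(0L, nrow = nrow(mat), ncol = ncol(mat) * k)
  # Linear positions of the non-NA cells
  keep <- which(!is.na(mat))
  # Row of each kept cell, and its target column: value + k * (source column - 1)
  rows <- row(mat)[keep]
  cols <- (col(mat)[keep] - 1L) * k + mat[keep]
  # Matrix indexing: each (row, column) pair gets a 1
  out[cbind(rows, cols)] <- 1L
  out
}

mat <- matrix(c(1, 3, NA, 1, 1, 3, NA, 1), nrow = 2, byrow = TRUE)
res <- recode_onehot(mat)
```

For the example matrix this should give the same 2 x 12 result as `newmat` above.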
You can also use this method with a sparse matrix from the Matrix package.
library(Matrix)
newmat <- Matrix(0, ncol=ncol(mat)*3, nrow=nrow(mat),sparse=TRUE)
newmat[ind[complete.cases(ind),]] <- 1
newmat
#2 x 12 sparse Matrix of class "dgCMatrix"
#
#[1,] 1 . . . . 1 . . . 1 . .
#[2,] 1 . . . . 1 . . . 1 . .
Using a sparse matrix has a number of advantages, including significantly reduced memory use: only the non-zero entries are stored, and here at most one entry in three is non-zero.
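As a possible variation, the sparse result can also be built in a single call with `sparseMatrix()` from the same package, passing the row and column indices directly rather than assigning into an empty sparse matrix. I have not benchmarked this against the assignment approach on large data, so treat it as a sketch; the variable names `keep` and `sp` are just for illustration.

```r
library(Matrix)

mat <- matrix(c(1, 3, NA, 1, 1, 3, NA, 1), nrow = 2, byrow = TRUE)

# Linear positions of the non-NA cells; NA cells contribute no entry at all
keep <- which(!is.na(mat))

# Build the n x 3m sparse matrix straight from (row, column) pairs
sp <- sparseMatrix(
  i = row(mat)[keep],                        # row of each non-NA cell
  j = (col(mat)[keep] - 1) * 3 + mat[keep],  # value + 3 * (source column - 1)
  x = rep(1, length(keep)),                  # every stored entry is 1
  dims = c(nrow(mat), ncol(mat) * 3)
)
```

Comparing `object.size(sp)` with `object.size(as.matrix(sp))` on your real data should show how much memory the sparse form actually saves.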
Upvotes: 3