Jakob Schöpe

Reputation: 25

Fill matrix using names with Rcpp

Suppose that the named elements of vectors stored in a list should be assigned to the matching columns of a matrix (see the example below).

library(microbenchmark)
set.seed(123)
myList <- list()
for(i in 1:10000) {
 myList[[i]] <- list(sample(setNames(rnorm(5), sample(LETTERS[1:5])), ceiling(runif(1,1,4))))
}

myMatrix <- matrix(NA, ncol = 5, nrow = 10000)
colnames(myMatrix) <- LETTERS[1:5]
for(i in 1:10000) {
 myMatrix[i, match(names(myList[[i]][[1]]), colnames(myMatrix))] <- myList[[i]][[1]] 
}
myList[[6]][[1]]
myMatrix[6,]

microbenchmark(for(i in 1:10000) {myMatrix[i, match(names(myList[[i]][[1]]), colnames(myMatrix))] <- myList[[i]][[1]]}, times = 10)

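In this example, the elements of 10,000 named vectors (each holding two to four values) are assigned to the matching columns of a matrix, one row per vector; columns without a matching name stay NA.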

Problem

The assignment is slow (approximately 3.5 seconds)!

Question

How can I speed up this process in R or with Rcpp?

Upvotes: 0

Views: 178

Answers (1)

Roland

Reputation: 132746

Use rbindlist from the data.table package. With fill = TRUE, it binds rows by matching column names and fills columns that are missing from a row with NA.
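
To see the name matching in isolation, here is a minimal sketch on two hand-made rows. The object names (rows, dt, m) are made up for illustration and are not part of the benchmark; the sketch assumes data.table is installed.

```r
library(data.table)

# Two "rows" with different, differently ordered names (illustrative data)
rows <- list(list(B = 2, A = 1), list(C = 3))

# fill = TRUE matches columns by name and pads missing ones with NA
dt <- rbindlist(rows, fill = TRUE)

# Convert to a matrix and sort the columns alphabetically
m <- as.matrix(dt)
m <- m[, order(colnames(m))]
print(m)
```

Columns appear in order of first occurrence, which is why the columns are sorted by name afterwards.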

library(microbenchmark)
n <- 10000
set.seed(123)
myList <- list()
for(i in 1:n) {
  myList[[i]] <- list(sample(setNames(rnorm(5), sample(LETTERS[1:5])), ceiling(runif(1,1,4))))
}

myMatrix <- matrix(NA, ncol = 5, nrow = n)
colnames(myMatrix) <- LETTERS[1:5]

library(data.table)
microbenchmark(match = for(i in 1:n) {myMatrix[i, match(names(myList[[i]][[1]]), colnames(myMatrix))] <- myList[[i]][[1]]}, 
               rbindlist = {
                 myMatrix1 <- as.matrix(rbindlist(lapply(myList, 
                                                         function(x) as.list(unlist(x))), 
                                                  fill = TRUE))
                 myMatrix1 <- myMatrix1[, order(colnames(myMatrix1))]
                 },
               times = 10)
#Unit: milliseconds
#     expr        min         lq       mean     median         uq        max neval cld
#    match 1392.52949 1496.40382 1599.63584 1605.39080 1690.98410 1761.67322    10   b
#rbindlist   48.76146   50.29176   51.66355   51.10672   53.75465   54.93798    10  a

all.equal(myMatrix, myMatrix1)
#TRUE
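
For comparison, the same fill can probably also be done in base R without data.table. The sketch below swaps rbindlist for a single matrix-indexed assignment (a two-column row/column index). It is an alternative technique, not what the benchmark above measures, and it has not been timed here. The names vecs, vals, rowIdx, colIdx and myMatrix2 are made up for illustration, and it assumes the outer list is unnamed so that unlist keeps the inner names unchanged.

```r
set.seed(123)
n <- 1000
cols <- LETTERS[1:5]

# Same structure as in the question: each element wraps one named vector
myList <- lapply(seq_len(n), function(i) {
  list(sample(setNames(rnorm(5), sample(cols)), ceiling(runif(1, 1, 4))))
})

# Flatten once, then compute the row and column index of every value
vecs   <- lapply(myList, `[[`, 1)
vals   <- unlist(vecs)                          # names are kept (outer list is unnamed)
rowIdx <- rep(seq_along(vecs), lengths(vecs))   # row of each value
colIdx <- match(names(vals), cols)              # column of each value

# A single vectorised assignment through a two-column index matrix
myMatrix2 <- matrix(NA_real_, nrow = n, ncol = length(cols),
                    dimnames = list(NULL, cols))
myMatrix2[cbind(rowIdx, colIdx)] <- vals
```

The idea is the same as in the loop, except that match is called once on all names instead of once per row.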

Upvotes: 2
