Reputation: 189
I have a matrix (data) and a list (list). I would like to match each vector in my list against the row names of my matrix, calculate the mean of the matched rows for each column, and add those values to a data frame (df), with each new column named after the corresponding vector in the list.
I have done it manually, but I am wondering how I can do it with a for loop or with more efficient code.
data <- matrix(runif(75, 5.0, 10), nrow = 15, ncol = 5)
rownames(data) <- paste0("GENE",1:15)
colnames(data) <- paste0("COL",1:5)
list <- list(n = c("GENE1","GENE2","GENE3"), s = c("GENE4", "GENE5","GENE6","GENE7"),b = c("GENE8","GENE10", "GENE12", "GENE15"))
df <- data.frame(row.names = colnames(data))
df$n <- matrixStats::rowMeans2(t(data[intersect(row.names(data),list$n),]))
df$s <- matrixStats::rowMeans2(t(data[intersect(row.names(data),list$s),]))
df$b <- matrixStats::rowMeans2(t(data[intersect(row.names(data),list$b),]))
Upvotes: 0
Views: 75
Reputation: 388817
If you want to avoid writing an explicit loop, try lapply in base R: for each element of the list, subset the rows of the matrix whose rownames appear in that element, then take the mean of every column with colMeans.
t(do.call("rbind", lapply(lst, function(x)
  colMeans(data[rownames(data) %in% x, ]))))
# n s b
#COL1 7.242129 7.667626 6.980115
#COL2 7.317233 6.297818 6.186642
#COL3 6.709917 7.061652 7.552923
#COL4 7.773472 6.741069 7.765780
#COL5 7.039789 6.584206 7.569894
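A possible variant of the same idea, as a sketch rather than the only way to do it: sapply binds the per-element results into columns directly, so the t(do.call("rbind", ...)) step is not needed, and drop = FALSE keeps the subset a matrix even if a list element matches only one gene (without it, colMeans would fail on the resulting vector). The names genes and means are just illustrative.

```r
set.seed(1234)
data <- matrix(runif(75, 5.0, 10), nrow = 15, ncol = 5,
               dimnames = list(paste0("GENE", 1:15), paste0("COL", 1:5)))
lst <- list(n = c("GENE1", "GENE2", "GENE3"),
            s = c("GENE4", "GENE5", "GENE6", "GENE7"),
            b = c("GENE8", "GENE10", "GENE12", "GENE15"))

# One column of means per list element; drop = FALSE keeps a matrix
# even when only a single gene matches.
means <- sapply(lst, function(genes)
  colMeans(data[rownames(data) %in% genes, , drop = FALSE]))

# Convert to a data frame with COL1..COL5 as row names and n, s, b as columns.
df <- as.data.frame(means)
```

Since the question asks for a data frame, as.data.frame is applied at the end; the column names come from the names of lst.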
data
set.seed(1234)
data <- matrix(runif(75, 5.0, 10), nrow = 15, ncol = 5)
rownames(data) <- paste0("GENE",1:15)
colnames(data) <- paste0("COL",1:5)
lst <- list(n=c("GENE1","GENE2","GENE3"), s = c("GENE4", "GENE5","GENE6","GENE7"),
b = c("GENE8","GENE10", "GENE12", "GENE15"))
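If you would rather keep an explicit for loop, as asked in the question, something along these lines should work. It is only a sketch that follows the manual code in the question, with colMeans standing in for matrixStats::rowMeans2(t(...)); the loop variable nm and the helper variable rows are illustrative names.

```r
set.seed(1234)
data <- matrix(runif(75, 5.0, 10), nrow = 15, ncol = 5)
rownames(data) <- paste0("GENE", 1:15)
colnames(data) <- paste0("COL", 1:5)
lst <- list(n = c("GENE1", "GENE2", "GENE3"),
            s = c("GENE4", "GENE5", "GENE6", "GENE7"),
            b = c("GENE8", "GENE10", "GENE12", "GENE15"))

# Start with an empty data frame that has one row per matrix column.
df <- data.frame(row.names = colnames(data))

for (nm in names(lst)) {
  # Genes of this list element that are actually present in the matrix.
  rows <- intersect(rownames(data), lst[[nm]])
  # Column means over the matched rows, stored under the element's name.
  df[[nm]] <- colMeans(data[rows, , drop = FALSE])
}
```

The loop adds one column per list element, so it scales to any number of named vectors without repeating the df$... lines by hand.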
Upvotes: 1