Reputation: 1769
I have a list of character vectors, called l. For example:
set.seed(42) ## for sake of reproducibility
genes <- paste("gene",1:20,sep="")
tot <- data.frame(term = sample(genes, 30, replace = TRUE),
                  num = sample(1:10, 30, replace = TRUE),
                  stringsAsFactors = FALSE)
s1<-sample(genes,2, replace=F)
s2<-sample(genes,4, replace=F)
s3<-sample(genes,3, replace=F)
s4<-sample(genes,2, replace=F)
s5<-sample(genes,2, replace=F)
s6<-sample(genes,3, replace=F)
l=list(s1,s2,s3,s4,s5,s6)
By considering tot[tot$term %in% l[[1]], ], I obtain:
     term num
1  gene17   4
3   gene1   6
7  gene17   2
26  gene1   6
and if I run
df=tot[tot$term%in%l[[1]],]
sum(df$num)
I obtain the sum of the second column, i.e. 18. For the other elements of the list I obtain, respectively: 32, 13, 19, 17, 29. This can be achieved with a for loop:
v <- vector()
for (j in seq_along(l)) {
  # rows of tot whose term appears in the j-th element of l
  df <- tot[tot$term %in% l[[j]], ]
  v <- c(v, sum(df$num))
}
I would like to know if there is a simpler way of doing this.
Upvotes: 1
Views: 40
Reputation: 388807
Here is one tidyverse way:
library(tidyverse)
enframe(l, value = 'term') %>%
  unnest(term) %>%
  left_join(tot, by = 'term') %>%
  group_by(name) %>%
  summarise(num = sum(num, na.rm = TRUE))
# name num
#* <int> <int>
#1 1 18
#2 2 32
#3 3 13
#4 4 19
#5 5 17
#6 6 29
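To make the role of the left join and `na.rm = TRUE` more concrete, here is a small base-R sketch of the same reshape, join, and aggregate idea. The objects `toy_tot` and `toy_l` are hypothetical toy data made up for illustration, not the question's `tot` and `l`. The point it tries to show is that a list element with no matching term still appears in the result, with a sum of 0.

```r
# Hypothetical toy data (NOT the question's `tot` and `l`)
toy_tot <- data.frame(
  term = c("a", "b", "a", "c", "b"),
  num  = c(4L, 6L, 2L, 1L, 6L),
  stringsAsFactors = FALSE
)
toy_l <- list(c("a", "b"), "c", "d")  # "d" does not occur in toy_tot

# Long format: one row per (list element, term); columns `values` and `ind`
long <- stack(setNames(toy_l, seq_along(toy_l)))

# Left join: all.x = TRUE keeps terms without a match (their num becomes NA)
joined <- merge(long, toy_tot, by.x = "values", by.y = "term", all.x = TRUE)

# Sum per list element; na.rm = TRUE turns the all-NA group into 0
res <- tapply(joined$num, joined$ind, sum, na.rm = TRUE)
res
# element 1: 18, element 2: 1, element 3: 0
```

With an inner join instead (the default of `merge`), the third element would be dropped from the result rather than reported as 0.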
Upvotes: 1
Reputation: 886938
It can be simplified with sapply
v2 <- sapply(l, function(j) sum(tot$num[tot$term %in% j]))
Checking against the OP's loop output:
identical(v, v2)
#[1] TRUE
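Since `sapply` picks its return type from whatever the function happens to return, a stricter variant is `vapply`, which declares the expected type and length of each result. The following is a minimal sketch on hypothetical toy data (`toy_tot`, `toy_l`, and the helper `sum_for` are names invented here for illustration); it assumes `num` is an integer column, as it is when built from `sample(1:10, ...)`.

```r
# Hypothetical toy data (NOT the question's `tot` and `l`)
toy_tot <- data.frame(
  term = c("a", "b", "a", "c", "b"),
  num  = c(4L, 6L, 2L, 1L, 6L),
  stringsAsFactors = FALSE
)
toy_l <- list(c("a", "b"), "c", "d")  # "d" does not occur in toy_tot

# Hypothetical helper: total of `num` over rows whose term is in `terms`
sum_for <- function(terms) sum(toy_tot$num[toy_tot$term %in% terms])

# integer(1) says: every call must return exactly one integer
v3 <- vapply(toy_l, sum_for, integer(1))
v3
# 18 1 0
```

If `num` were a double column, `numeric(1)` would be the template to use instead of `integer(1)`.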
Or, more compactly, with map_dbl from purrr
library(purrr)
map_dbl(l, ~ sum(tot$num[tot$term %in% .x]))
Or with tidyverse
library(dplyr)
stack(setNames(l, seq_along(l))) %>%
  group_by(ind) %>%
  summarise(Sum = tot %>%
              filter(term %in% values) %>%
              pull(num) %>%
              sum) %>%
  pull(Sum)
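Each of the approaches above scans `tot` once per list element. If the list is long, one possible alternative (a sketch, not something taken from the answers above) is to total `num` per term a single time and then look the terms up by name. The names `toy_tot`, `toy_l`, and `per_term` are hypothetical and used only for illustration.

```r
# Hypothetical toy data (NOT the question's `tot` and `l`)
toy_tot <- data.frame(
  term = c("a", "b", "a", "c", "b"),
  num  = c(4L, 6L, 2L, 1L, 6L),
  stringsAsFactors = FALSE
)
toy_l <- list(c("a", "b"), "c", "d")  # "d" does not occur in toy_tot

# One pass over the table: named totals per term (a = 6, b = 12, c = 1)
per_term <- tapply(toy_tot$num, toy_tot$term, sum)

# Look up each element's terms; intersect() drops unknown terms and
# duplicates, which mirrors the behaviour of `%in%`
v4 <- vapply(toy_l, function(terms) {
  sum(per_term[intersect(terms, names(per_term))])
}, numeric(1))
v4
# 18 1 0
```

Whether this is actually faster depends on the sizes involved; for a handful of list elements the `sapply` version is likely simpler to read.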
Upvotes: 2