Reputation: 839
Assume there is a data.table
with the IDs 1,2 and 3 and a VALUE column. There is also a list of permutations of these IDs that define the subsets I would like to calculate a statistic for. E.g. calculate the mean of VALUE for the three groups ID in {1 ,2}, {2 and 3}, {1 and 3}:
library(data.table)
DT <- data.table(ID = c(1,1,2,2,3,3),VALUE = c(1,2,10,20,100,200))
permutations <- list(c(1,2),c(2,3),c(1,3))
How can I generate the desired output
permutation MEAN
1 8.25
2 82.5
3 75.75
My pseudocode without a loop or apply
would look something like this but of course it does not work:
DT[,.(MEAN = mean(VALUE)),by = .(ID %in% permutations)]
Is it possible to do that without a loop or a call to apply? The takes very long for my table.
Upvotes: 1
Views: 34
Reputation: 887118
We can do a join the 'data.table' with the 'permutations' after converting ito into a 'data.frame' and then do a group by mean
DT[stack(setNames(permutations, seq_along(permutations))),
on = .(ID = values)][, .(MEAN = mean(VALUE)), by = .(permutation = ind)]
# permutation MEAN
#1: 1 8.25
#2: 2 82.50
#3: 3 75.75
Upvotes: 1
Reputation: 51592
A base R approach would be,
sapply(permutations, function(i) {x <- DT[DT$ID %in% i]; mean(x$VALUE)})
#[1] 8.25 82.50 75.75
Upvotes: 1