dayne

Reputation: 7784

Using data.table for multiple aggregation steps

I am trying to perform several aggregation steps with data.table. First I want the median value at each concentration, by plate, for one specific sample type (type 1 in the example below). Then I want the maximum of those medians for each plate.

library(data.table)

set.seed(1)
DT <- data.table(plate = rep(paste0("plate",1:3),each=11),
                 type = rep(c(rep(1,9),2,2),3),
                 value = sample(1:25,33,replace=TRUE),
                 conc = rep(c(rep(1:3,each=3),4,4),3)
                 )

I got the following to work:

DT[,med := median(value[type==1]),by=list(plate,conc)]
DT[,max := max(med,na.rm=TRUE),by=plate]

Is it possible to do this multi-step aggregation without adding the intermediate med column?

Upvotes: 2

Views: 297

Answers (1)

eddi

Reputation: 49448

You could, for example, do the following:

DT[, max := max(.SD[, median(value[type == 1]), by = conc]$V1, na.rm = TRUE),
     by = plate]

but I'm pretty sure your two-line approach is much faster, since this version runs a nested data.table query on .SD for every plate.
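
Another option, offered only as a sketch and not benchmarked here, is to chain the two aggregations on a filtered subset and then join the per-plate result back onto DT. The name plate_max is just an illustrative helper, and the on= join syntax assumes a reasonably recent data.table (1.9.6 or later). Filtering on type == 1 drops the groups that contain no type 1 rows, which matches what na.rm = TRUE does with the NA medians in the original two-line version.

```r
library(data.table)

set.seed(1)
DT <- data.table(plate = rep(paste0("plate", 1:3), each = 11),
                 type  = rep(c(rep(1, 9), 2, 2), 3),
                 value = sample(1:25, 33, replace = TRUE),
                 conc  = rep(c(rep(1:3, each = 3), 4, 4), 3))

# Step 1: median per plate/conc, using type 1 rows only.
# Step 2 (chained): maximum of those medians per plate.
# plate_max is a hypothetical helper name; it has one row per plate.
plate_max <- DT[type == 1,
                .(med = median(value)),
                by = .(plate, conc)][, .(max = max(med)), by = plate]

# Update join: add the per-plate maximum to DT by reference,
# without ever storing a med column in DT.
DT[plate_max, max := i.max, on = "plate"]
```

This keeps DT free of the intermediate column at the cost of a small temporary table. If you only need the per-plate summary, you can stop at plate_max and skip the join.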

Upvotes: 3
