Reputation: 453
I would like to pass the functions to be evaluated in the j slot of a data.table as arguments, in the style of:
DT <- as.data.table(structure(list(peak.grp = c(1L, 2L, 2L, 2L, 2L), s = c(248, 264,
282, 304, 333), height = c(222772.8125, 370112.28125, 426524.03125, 649691.75, 698039)), class = "data.frame", row.names = c(NA,
-5L)))
list_of_functions_with_parameters <- list(sum = list(x = s, na.rm = TRUE), mean = list(x = height, na.rm = TRUE))
vector_of_variable_names <- c("Sum.s", "Mean.height")
vector_for_by <- c("peak.grp")
perform_dt_operations <-
  function(DT, vector_of_variable_names, list_of_functions_with_parameters, vector_for_by){
    DT <- DT[, .(vector_of_variable_names = list_of_functions_with_parameters), by = row.names(DT)]
    return(DT)
  }
The output should then be:
Output <- perform_dt_operations(DT, vector_of_variable_names, list_of_functions_with_parameters, vector_for_by)
dput(as.data.frame(Output))
structure(list(peak.grp = c(1, 2), Sum.s = c(248, 1183), Mean.height = c(222772.8125,
536091.765625)), row.names = c(NA, -2L), class = "data.frame")
Is there a way to do something like that?
Upvotes: 0
Views: 41
Reputation: 132706
This is only possible if the elements of list_of_functions_with_parameters are quoted (left unevaluated), which means the list needs to be created with alist() instead of list().
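To see why the quoting matters, here is a minimal sketch in base R. With list(), the expression x = s would be evaluated immediately and fail because no object s exists outside the data.table, whereas alist() stores the call unevaluated so it can be evaluated later where s is available. The names quoted, arg_list and result are only illustrative, and the small vector stands in for a column.

```r
# alist() keeps its arguments as unevaluated calls
quoted <- alist(sum = list(x = s, na.rm = TRUE))
is.call(quoted$sum)  # TRUE

# Evaluate the stored call later, in a context where `s` exists
# (here a plain list standing in for the columns of a data.table group)
arg_list <- eval(quoted$sum, list(s = c(1, 2, NA)))

# Call the function named by the list element with the evaluated arguments
result <- do.call(names(quoted)[1], arg_list)
result  # 3
```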
list_of_functions_with_parameters <- alist(sum = list(x = s, na.rm = TRUE),
mean = list(x = height, na.rm = TRUE))
vector_of_variable_names <- c("Sum.s", "Mean.height")
vector_for_by <- c("peak.grp")
perform_dt_operations <-
  function(DT, vector_of_variable_names, list_of_functions_with_parameters, vector_for_by){
    stopifnot(length(vector_of_variable_names) == length(list_of_functions_with_parameters))
    DT[,{
      res <- list()
      for (i in seq_along(vector_of_variable_names)) {
        l <- eval(list_of_functions_with_parameters[[i]]) #evaluate within .SD
        res[vector_of_variable_names[i]] <-
          do.call(names(list_of_functions_with_parameters)[i], l)
      }
      res
    }, by = vector_for_by]
  }
perform_dt_operations(DT, vector_of_variable_names,
                      list_of_functions_with_parameters, vector_for_by)
# peak.grp Sum.s Mean.height
#1: 1 248 222772.8
#2: 2 1183 536091.8
As you see, this is some fairly complex code. I'm not sure I'd recommend this approach.
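If the interface may change, one possibly simpler route is to avoid quoting altogether and pass column names as strings. The following is only a sketch under that assumption: the spec layout (fun, col, args) and the helper summarise_by_spec are hypothetical names invented for this illustration, not part of data.table.

```r
library(data.table)

DT <- data.table(
  peak.grp = c(1L, 2L, 2L, 2L, 2L),
  s        = c(248, 264, 282, 304, 333),
  height   = c(222772.8125, 370112.28125, 426524.03125, 649691.75, 698039)
)

# Hypothetical spec: output name -> function, column name (string), extra arguments
spec <- list(
  Sum.s       = list(fun = sum,  col = "s",      args = list(na.rm = TRUE)),
  Mean.height = list(fun = mean, col = "height", args = list(na.rm = TRUE))
)

# Hypothetical helper: apply each spec entry to its column within every group
summarise_by_spec <- function(DT, spec, by) {
  DT[, lapply(spec, function(p) {
    # .SD[[p$col]] extracts the column by name for the current group
    do.call(p$fun, c(list(.SD[[p$col]]), p$args))
  }), by = by]
}

out <- summarise_by_spec(DT, spec, by = "peak.grp")
out
```

Because the columns are addressed by name, nothing has to be quoted or evaluated by hand, and the names of spec become the output column names.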
Upvotes: 3