yasel

Reputation: 453

Passing a data.table's j slot (variable names, functions, and their arguments) as arguments

I would like to pass the functions to be evaluated in the j slot of a data.table, together with their arguments and the names of the resulting columns, as function arguments, in the style of:

DT <- as.data.table(structure(list(peak.grp = c(1L, 2L, 2L, 2L, 2L),
  s = c(248, 264, 282, 304, 333),
  height = c(222772.8125, 370112.28125, 426524.03125, 649691.75, 698039)),
  class = "data.frame", row.names = c(NA, -5L)))

list_of_functions_with_parameters <- list(sum = list(x = s, na.rm = TRUE), mean = list(x = height, na.rm = TRUE))

vector_of_variable_names <- c("Sum.s", "Mean.height")

vector_for_by <- c("peak.grp")

perform_dt_operations <-
    function(DT, vector_of_variable_names, list_of_functions_with_parameters, vector_for_by){

    DT <- DT[, .(vector_of_variable_names = list_of_functions_with_parameters), by = row.names(DT)]

    return(DT)

}

The output should then be:

Output <- perform_dt_operations(DT, vector_of_variable_names, list_of_functions_with_parameters, vector_for_by)


dput(as.data.frame(Output))

structure(list(peak.grp = c(1, 2), Sum.s = c(248, 1183), Mean.height = c(222772.8125, 
536091.765625)), row.names = c(NA, -2L), class = "data.frame")

Is there a way to do something like that?

Upvotes: 0

Views: 41

Answers (1)

Roland

Reputation: 132706

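This is only possible if the elements of list_of_functions_with_parameters are quoted, i.e. stored unevaluated. With list(), R tries to evaluate s and height immediately, outside the data.table, where they do not exist. The list therefore needs to be an alist.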
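To illustrate the difference, here is a minimal base R sketch that needs no data.table. The name hypothetical_col is made up for illustration and is assumed not to exist in the session, and the plain list passed to eval() merely stands in for the columns of one group.

```r
# list() evaluates its arguments immediately, so a symbol that is not
# defined in the calling environment raises an error.
# `hypothetical_col` is an illustrative name assumed to be undefined.
eager <- tryCatch(list(x = hypothetical_col),
                  error = function(e) "error")

# alist() stores each argument as an unevaluated expression instead.
quoted <- alist(sum = list(x = s, na.rm = TRUE))
is_quoted <- is.call(quoted$sum)  # the element is a call to list(), not a list

# Evaluate the call later, in a context where `s` is available
# (here a plain list standing in for one group's columns).
arg_list <- eval(quoted$sum, list(s = c(264, 282, 304, 333)))

# The element name supplies the function, the evaluated list its arguments.
total <- do.call(names(quoted)[1], arg_list)
```

Inside DT[...], the same eval() call finds s and height among the columns of the current group, which is what the function below relies on.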

list_of_functions_with_parameters <- alist(sum = list(x = s, na.rm = TRUE), 
                                       mean = list(x = height, na.rm = TRUE))

vector_of_variable_names <- c("Sum.s", "Mean.height")

vector_for_by <- c("peak.grp")

perform_dt_operations <-
  function(DT, vector_of_variable_names, list_of_functions_with_parameters, vector_for_by){

    stopifnot(length(vector_of_variable_names) == length(list_of_functions_with_parameters))

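    DT[,{
      res <- list()  # one named element per output column
      for (i in seq_along(vector_of_variable_names)) {
        # evaluate the quoted argument list within .SD, so that s and height resolve to columns
        l <- eval(list_of_functions_with_parameters[[i]])
        # call the function named by the list element with those arguments
        res[[vector_of_variable_names[i]]] <-
              do.call(names(list_of_functions_with_parameters)[i], l)
      }
      res
    }, by = vector_for_by]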
  }

perform_dt_operations(DT, vector_of_variable_names,
  list_of_functions_with_parameters, vector_for_by)

#   peak.grp Sum.s Mean.height
#1:        1   248    222772.8
#2:        2  1183    536091.8

As you can see, this is fairly complex code. I'm not sure I'd recommend this approach.
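For comparison, when the functions and columns are known in advance, the same numbers come from an ordinary j expression. This is only a sketch of the hard-coded equivalent for the example data from the question, not a general replacement for the function above.

```r
library(data.table)

# Example data from the question
DT <- data.table(
  peak.grp = c(1L, 2L, 2L, 2L, 2L),
  s = c(248, 264, 282, 304, 333),
  height = c(222772.8125, 370112.28125, 426524.03125, 649691.75, 698039)
)

# Hard-coded equivalent: column names and functions written directly in j
out <- DT[, .(Sum.s = sum(s, na.rm = TRUE),
              Mean.height = mean(height, na.rm = TRUE)),
          by = peak.grp]
```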

Upvotes: 3
