Fabio Correa

Reputation: 1363

How to apply a function inside a nested vector in data.table?

I have a data.table with nested fields such as:

library(data.table)

dt <- data.table(name_var = c("Abel","Abel", "Bill", "Bill", "Craig", "Craig", "Craig")
           , value_var = c(1,2,3,4,5,6,7)
           , car_color = c("B","B","B","G","G","G","G") 
)

dt_2 <- dt[,.(.(.SD)), by = name_var]

The function below applies a specified function to one column of a data.table:

transform_value <- function(x, fun, campo, ...) {
  x[, get(fun)(get(campo), ...)]
}

For instance, one may calculate the mean of value_var inside each nested table in V1:

dt_2[, mean_value:=lapply(V1, transform_value, "mean", "value_var")]

which correctly results in

   name_var           V1 mean_value
1:     Abel <data.table>        1.5
2:     Bill <data.table>        3.5
3:    Craig <data.table>          6

However, when I try to calculate the log of the nested value_var I get:

dt_2[, log_value:=lapply(V1, transform_value, "log", "value_var")]

that results in:

   name_var           V1 mean_value                  log_value
1:     Abel <data.table>        1.5        0.0000000,0.6931472
2:     Bill <data.table>        3.5          1.098612,1.386294
3:    Craig <data.table>          6 1.609438,1.791759,1.945910

Although the values are correct, I would like the log values to appear as a new column inside each nested table in V1, such as:

> dt_2$V1[[1]]

   value_var car_color  log_value
1:         1         B  0.0000000
2:         2         B  0.6931472

How do I do that?

Thank you.

Upvotes: 1

Views: 109

Answers (1)

chinsoon12

Reputation: 25225

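mean works because it collapses a vector to a single value, whereas log returns a vector of the same length as its input, so each group ends up with several values in one cell. One option is to modify your function so that it appends the result as a new column of the nested table: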

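transform_value <- function(x, fun, campo, ...) {
    # Keep every existing column (.SD) and append fun(campo) as a new, unnamed column
    a <- x[, c(.SD, .(match.fun(fun)(get(campo), ...)))]
    # Rename the appended (last) column, e.g. "log_value"; setnames() returns a invisibly
    setnames(a, names(a)[length(a)], paste0(fun, "_value"))
}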

dt_2[, V1 := lapply(V1, transform_value, "log", "value_var")]

dt_2$V1 now contains:

[[1]]
   value_var car_color log_value
1:         1         B 0.0000000
2:         2         B 0.6931472

[[2]]
   value_var car_color log_value
1:         3         B  1.098612
2:         4         G  1.386294

[[3]]
   value_var car_color log_value
1:         5         G  1.609438
2:         6         G  1.791759
3:         7         G  1.945910
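
The scalar-versus-vector difference, and how the modified function handles both cases, can be checked on a single nested table. This is only a minimal sketch: `nested`, `with_log` and `with_mean` are hypothetical names that stand in for Abel's rows from the question, and the function is the one above with the result returned explicitly.

```r
library(data.table)

# Scalar vs. vector: mean() collapses its input, log() keeps one value per element.
x <- c(1, 2)
length(mean(x))  # 1
length(log(x))   # 2

# Same logic as the function in the answer, returning the table explicitly.
transform_value <- function(x, fun, campo, ...) {
  a <- x[, c(.SD, .(match.fun(fun)(get(campo), ...)))]
  setnames(a, names(a)[length(a)], paste0(fun, "_value"))
  a
}

# Hypothetical stand-in for one nested table (Abel's rows from the question).
nested <- data.table(value_var = c(1, 2), car_color = c("B", "B"))

with_log  <- transform_value(nested, "log", "value_var")   # adds log_value, one value per row
with_mean <- transform_value(nested, "mean", "value_var")  # adds mean_value, recycled to every row
```

Because a length-1 result in j is recycled, a summarising function such as mean should still work with this version; its value is simply repeated on each row of the nested table.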

Upvotes: 1
