I have a data.table with nested fields such as:
library(data.table)
dt <- data.table(name_var = c("Abel","Abel", "Bill", "Bill", "Craig", "Craig", "Craig")
, value_var = c(1,2,3,4,5,6,7)
, car_color = c("B","B","B","G","G","G","G")
)
dt_2 <- dt[,.(.(.SD)), by = name_var]
The function below transforms the data through a specified function:
transform_value <- function(x, fun, campo, ...) {
  x[, get(fun)(get(campo), ...)]
}
For instance, one may calculate the average of value_var inside the nested V1:
dt_2[, mean_value:=lapply(V1, transform_value, "mean", "value_var")]
which correctly results in
name_var V1 mean_value
1: Abel <data.table> 1.5
2: Bill <data.table> 3.5
3: Craig <data.table> 6
However, when I try to calculate the log of the nested value_var, I get:
dt_2[, log_value:=lapply(V1, transform_value, "log", "value_var")]
that results in:
name_var V1 mean_value log_value
1: Abel <data.table> 1.5 0.0000000,0.6931472
2: Bill <data.table> 3.5 1.098612,1.386294
3: Craig <data.table> 6 1.609438,1.791759,1.945910
Although the values are correct, I would actually like the log values to sit side by side with the other columns inside each nested V1 table, such as:
> dt_2$V1[[1]]
value_var car_color log_value
1: 1 B 0.0000000
2: 2 B 0.6931472
How do I do that?
Thank you.
Reputation: 25225
mean works because it returns a scalar when given a vector, whereas log returns a vector of the same length as its input. Here is an option to modify your function:
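transform_value <- function(x, fun, campo, ...) {
  # Append the transformed column to the existing columns of the nested table
  a <- x[, c(.SD, .(match.fun(fun)(get(campo), ...)))]
  # Rename the new (last) column, e.g. "log_value"; setnames() returns the table
  setnames(a, names(a)[length(a)], paste0(fun, "_value"))
}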
dt_2[, V1 := lapply(V1, transform_value, "log", "value_var")]
dt_2$V1:
[[1]]
value_var car_color log_value
1: 1 B 0.0000000
2: 2 B 0.6931472
[[2]]
value_var car_color log_value
1: 3 B 1.098612
2: 4 G 1.386294
[[3]]
value_var car_color log_value
1: 5 G 1.609438
2: 6 G 1.791759
3: 7 G 1.945910
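As a rough sketch of the same idea, the difference between mean and log can be checked directly, and the new column could also be added with := on a copy of each nested table. The helper name add_transformed is hypothetical (it is not part of the original post), and the copy() call is an assumption made here so that the original nested tables are not modified by reference:

```r
library(data.table)

dt <- data.table(
  name_var  = c("Abel", "Abel", "Bill", "Bill", "Craig", "Craig", "Craig"),
  value_var = c(1, 2, 3, 4, 5, 6, 7),
  car_color = c("B", "B", "B", "G", "G", "G", "G")
)
dt_2 <- dt[, .(.(.SD)), by = name_var]

# mean() collapses a vector to length 1; log() keeps the input length.
length(mean(c(1, 2)))  # 1
length(log(c(1, 2)))   # 2

# Hypothetical helper (not from the original post): add the transformed
# column by reference to a copy of each nested table.
add_transformed <- function(x, fun, campo, ...) {
  out <- copy(x)  # copy so the original nested table is left untouched
  out[, (paste0(fun, "_value")) := match.fun(fun)(get(campo), ...)]
  out[]
}

dt_2[, V1 := lapply(V1, add_transformed, "log", "value_var")]
dt_2$V1[[1]]
```

Note that with a function returning a scalar, such as mean, := recycles the single value across every row of the nested table, which may or may not be what you want.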