Reputation: 13817
We would like to summarize a data table to create a lot of new variables that result from the combination of columns names and values from the original data. Here is reproducile example illustrating the result we would like to achieve with two columns only for the sake of brevity
library(data.table)
data('mtcars')
setDT(mtcars)
# Desired output
mtcars[, .(
acm_hp_carb2 = mean(hp[which( carb <= 2)], na.rm=T),
acm_wt_am1 = mean(wt[which( am== 1)], na.rm=T)
), by= .(cyl, gear)]
Because we want to summarize a lot of columns, we created a function that returns all the strings that we would use to create each summary variable. In this example, we have this:
a <- 'acm_hp_carb2 = mean(hp[which( carb <= 2)], na.rm=T)'
b <- 'acm_wt_am1 = mean(wt[which( am== 1)], na.rm=T)'
And here is our failed attempt. Note that the new columns created do not receive the names we want to assign to them.
mtcars[, .(
eval(parse(text=a)),
eval(parse(text=b))
), by= .(cyl, gear)]
Upvotes: 3
Views: 256
Reputation: 66819
You can make one call and eval it:
f = function(...){
ex = parse(text = sprintf(".(%s)", paste(..., sep=", ")))[[1]]
print(ex)
mtcars[, eval(ex), by=.(cyl, gear)]
}
f(a,b)
a2 <- 'acm_hp_carb2 = mean(hp[carb <= 2], na.rm=T)'
b2 <- 'acm_wt_am1 = mean(wt[am == 1], na.rm=T)'
f(a2, b2)
I guess the which() is not needed.
Upvotes: 1
Reputation: 28695
Seems like the only part which isn't working is the column names. If you put a
and b
in a vector and add names to them, you can use lapply
to do the eval(parse
and keep the names from the vector. I used regex to get the names, but presumably in the real code you can assign the names as whatever variable you're using to construct the strings in the first place.
Result has many NaNs but it matches your desired output.
to_make <- c(a, b)
to_make <- setNames(to_make, sub('^(.*) =.*', '\\1', to_make))
mtcars2[, lapply(to_make, function(x) eval(parse(text = x)))
, by= .(cyl, gear)]
# cyl gear acm_hp_carb2 acm_wt_am1
# 1: 6 4 NaN 2.747500
# 2: 4 4 76.0 2.114167
# 3: 6 3 107.5 NaN
# 4: 8 3 162.5 NaN
# 5: 4 3 97.0 NaN
# 6: 4 5 102.0 1.826500
# 7: 8 5 NaN 3.370000
# 8: 6 5 NaN 2.770000
Upvotes: 3