I'm wondering if there's a way to apply a function whose name is stored in a string variable to the .SD columns of a data.table.
I can generalize all other parts of function calls using a data.table, including input and output columns, which I'm very happy about. But the final piece seems to be applying a variable function to a data.table, which is something I believe I've done before with dplyr and do.call.
library(data.table)
mtcars <- as.data.table(mtcars)
returnNames <- "calculatedColumn"
SDnames <- c("mpg","hp")
# Multiplies the first column of the subset by the second
myfunc <- function(data) {
  print(data)
  return(data[,1]*data[,2])
}
This obviously works:
mtcars[,eval(returnNames) := myfunc(.SD),.SDcols = SDnames,by = cyl]
But if I want to apply a dynamic function, something like this does not work:
functionCall <- "myfunc"
mtcars[,eval(returnNames) := lapply(.SD,eval(functionCall)),.SDcols = SDnames,by = cyl]
I get this error:
Error in `[.data.table`(mtcars, , `:=`(eval(returnNames), lapply(.SD, : attempt to apply non-function
Is using "apply" with "eval" the right idea, or am I on the wrong track entirely?
Upvotes: 3
You don't need lapply. Since myfunc takes a data.table with multiple columns, you just want to feed such a data table into the function as one object.
To look up a function by its name, use get instead of eval.
On the left-hand side of :=, you can just put the character vector in parentheses; eval isn't needed:
mtcars[, (returnNames) := get(functionCall)(.SD)
, .SDcols = SDnames
, by = cyl]
head(mtcars)
#     mpg cyl disp  hp drat    wt  qsec vs am gear carb calculatedColumn
# 1: 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4           2310.0
# 2: 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4           2310.0
# 3: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1           2120.4
# 4: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1           2354.0
# 5: 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2           3272.5
# 6: 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1           1900.5
The code above was run after the following setup code:
library(data.table)
mtcars <- as.data.table(mtcars)
returnNames <- "calculatedColumn"
SDnames <- c("mpg","hp")
myfunc <- function(data) {
  print(data)
  return(data[,1]*data[,2])
}
functionCall <- "myfunc"
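As a rough illustration of why eval fails here and get works, the sketch below checks what each call returns for a string, then wraps the pattern in a small helper. The helper name addComputedColumn and its arguments (fname, outName, cols, groupCols) are hypothetical and not part of data.table. This variant of myfunc returns a plain vector and skips the print call, which is a simplification of the original.

```r
library(data.table)

dt <- as.data.table(mtcars)

# Simplified variant of myfunc: multiplies the first two columns of the subset
myfunc <- function(data) data[[1]] * data[[2]]
functionCall <- "myfunc"

# eval() on a character string just returns the string itself,
# which is why lapply() reported "attempt to apply non-function"
evalIsFunction <- is.function(eval(functionCall))

# get() looks the name up and returns the function object
getIsFunction <- is.function(get(functionCall, mode = "function"))

# Hypothetical helper: every piece (function name, output column,
# input columns, grouping columns) is passed in as a string
addComputedColumn <- function(dt, fname, outName, cols, groupCols) {
  fn <- get(fname, mode = "function")
  dt[, (outName) := fn(.SD), .SDcols = cols, by = groupCols]
  invisible(dt)
}

addComputedColumn(dt, functionCall, "calculatedColumn", c("mpg", "hp"), "cyl")
```

Because := modifies the table by reference, dt has the new column after the call without reassignment. Passing mode = "function" to get is an optional safeguard so that a non-function object with the same name is skipped during lookup.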
Upvotes: 3