Reputation: 88
The FAQ states that the preferred way to add a new column to a data.table when programming is to use quote() and then eval(). But what if I want to add several columns at once? Playing around with this I came up with the following solution:
library(data.table)
DT <- data.table(V1=1:1000,
V2=2001:3000)
col.names <- c("V3","V4")
col.specs <- vector("list",2)
col.specs[[1]] <- quote(V1**2)
col.specs[[2]] <- quote((V1+V2)/2)
DT[,c(col.names) := lapply(col.specs,eval,envir=DT)]
which yields the desired result:
> head(DT)
   V1   V2 V3   V4
1:  1 2001  1 1001
2:  2 2002  4 1002
3:  3 2003  9 1003
4:  4 2004 16 1004
5:  5 2005 25 1005
6:  6 2006 36 1006
My question is simply: is this the preferred method? Specifically, can someone think of a way to avoid specifying the environment in the lapply() call? If I leave it out I get:
> DT[,c(col.names) := lapply(col.specs,eval)]
Error in eval(expr, envir, enclos) : object 'V1' not found
It may be no big deal, but at least to me it feels a bit suspicious that the data table does not recognise its own columns. Also, if I add the columns one by one, there is no need to specify the environment:
> DT <- data.table(V1=1:1000,
+ V2=2001:3000)
> col.names <- c("V3","V4")
> col.specs <- vector("list",2)
> col.specs[[1]] <- quote(V1**2)
> col.specs[[2]] <- quote((V1+V2)/2)
> for (i in 1L:length(col.names)) {
+ DT[,col.names[i] := list(eval(col.specs[[i]]))]
+ }
> head(DT)
   V1   V2 V3   V4
1:  1 2001  1 1001
2:  2 2002  4 1002
3:  3 2003  9 1003
4:  4 2004 16 1004
5:  5 2005 25 1005
6:  6 2006 36 1006
Upvotes: 2
Views: 95
Reputation: 66819
Since things are easier with a single quoted expression...
library(data.table)
DT <- data.table(V1=1:1000, V2=2001:3000)
new_cols = list(
V3 = quote(V1**2),
V4 = quote((V1+V2)/2)
)
e = as.call(c(quote(`:=`), new_cols))
DT[, eval(e)]
Then you can freely add to or edit new_cols, with the names in close proximity to the expressions.
Sources: Arun, and me citing him before.
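As for why the lapply() version in the question needs envir: a likely explanation is that eval() defaults to envir = parent.frame(), and when lapply() is the caller, that frame belongs to lapply() rather than to the scope where the columns live. Here is a base-R sketch of the same effect. It uses with() on a plain list as a stand-in for the j scope, which is an assumption for illustration only (data.table's internals differ), and the names cols, specs, A and B are hypothetical.

```r
# Hypothetical stand-in for the table's columns (not the DT from the question)
cols  <- list(A = 1:3, B = 4:6)
specs <- list(quote(A^2), quote((A + B) / 2))

# eval() called directly: parent.frame() is the scope set up by with(),
# so A and B are found.
direct <- with(cols, eval(specs[[1]]))

# eval() called by lapply(): parent.frame() is now lapply's own frame,
# where A and B do not exist, so this signals an error that we capture.
via_lapply <- tryCatch(
  with(cols, lapply(specs, eval)),
  error = function(e) conditionMessage(e)
)

# Passing the data explicitly, as the question does with envir=DT, works.
explicit <- lapply(specs, eval, envir = cols)
```

Building one call up front, as above, sidesteps the issue because no eval() is left inside lapply().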
Side note. The syntax above is
`:=`(col = v, col2 = v2, ...)
But we should also be able to do
c("col", "col2") := list(v, v2)
# aka
`:=`(c("col", "col2"), list(v, v2))
However, I can't figure out how to do it:
DT <- data.table(V1=1:1000, V2=2001:3000)
e2 = as.expression(list(quote(`:=`), names(new_cols), unname(new_cols)))
# gives an error:
DT[, eval(e2)]
# even though it works when written directly:
DT[, `:=`(c("V3", "V4"), list(V1^2, (V1 + V2)/2))]
I'd like to know how to do that, though...
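One possible way, offered as a sketch rather than a confirmed recipe: as.expression() on a list yields an expression vector with three separate elements, not a single call, so there is no `:=` call for j to act on. Building the whole thing with as.call(), including the inner list(...) on the right-hand side, should give the intended form. This assumes a reasonably recent data.table in which eval() of a `:=` call in j is recognised, as it is for e above. The helper name rhs is hypothetical.

```r
library(data.table)

DT <- data.table(V1 = 1:1000, V2 = 2001:3000)
new_cols <- list(
  V3 = quote(V1^2),
  V4 = quote((V1 + V2) / 2)
)

# Right-hand side as a call: list(V1^2, (V1 + V2)/2)
rhs <- as.call(c(quote(list), unname(new_cols)))

# Full call: `:=`(c("V3", "V4"), list(...)); the names go in as a plain
# character vector, which stands in for the c("col", "col2") on the left
e2 <- as.call(list(quote(`:=`), names(new_cols), rhs))

DT[, eval(e2)]
```

Printing e2 before evaluating it is a quick way to check that it matches the directly written version.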
Upvotes: 1