Richard Erickson

Reputation: 2616

Using := with eval(as.symbol()) to create columns

I am using the data.table package in R (version 3.3.2 on OS X 10.11.6) and have noticed a change in behavior from version 1.9.6 to 1.10.0 with respect to using the := operator with a character string as the column name.

I am naming columns inside a loop based on the index number. Previously, I had been using eval(as.symbol("string")) on both sides of := (based on answers to a previous question), but this no longer works. Through trial and error, I figured out that I need to use ("string") on the left side and eval(as.symbol("string")) on the right-hand side.

Here is an MCVE that demonstrates this behavior:

library(data.table)
dt <- data.table(col1 = 1:10, col2 = 11:20)

## the next lines would be inside a loop that is excluded to simplify this MCVE
colA = paste0("col", 1)
colB = paste0("col", 2)
colC = paste0("col", 3)

## Old code that worked with 1.9.6, but no longer works
dt[ , eval(as.symbol(colC)) := eval(as.symbol(colA)) + eval(as.symbol(colB))]

## New code that works with 1.10.0
dt[ , (colC) := eval(as.symbol(colA)) + eval(as.symbol(colB))]

I have looked through the data.table documentation and have not been able to figure out why this workaround works. So, here is my question:

Why do I need the eval(as.symbol("string")) on the right side, but not on the left?

Upvotes: 0

Views: 720

Answers (1)

Akhil Nair

Reputation: 3274

From a discussion, it is now assumed that if j is a single string, it is evaluated as a symbol, so that, for example, dt[, "col" := 3] will also work.

There has been a fair bit of back and forth about exactly when this became the default, but the full story is contained in both the previous post and the data.table news.
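As a minimal sketch of the difference between the two sides (assuming a recent data.table release; get() is used here as a shorter stand-in for eval(as.symbol()), and the column names are taken from the question's MCVE):

```r
library(data.table)

dt <- data.table(col1 = 1:10, col2 = 11:20)
colA <- "col1"
colB <- "col2"
colC <- "col3"

# LHS: the parentheses tell data.table to use the *value* of colC ("col3")
# as the name of the column to create.
# RHS: get() looks up each column by the name stored in the variable.
dt[, (colC) := get(colA) + get(colB)]

# Without parentheses, the bare symbol on the LHS is taken literally,
# so this creates a column that is itself called "colC".
dt[, colC := get(colA) * 2L]

print(names(dt))
```

The left-hand side of := only has to supply a column name, so a string (or a parenthesised variable holding one) is enough. The right-hand side is an ordinary expression evaluated against the columns, so a string there has to be turned into a column lookup explicitly.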

It may be of interest to you, however, that with

new_cols = c("j1", "j2")
dt[, (new_cols) := value]  # parentheses so we don't just create a column named "new_cols"

or

dt[, c("j1", "j2") := value]

it may be possible for you to achieve the above without needing a loop:

library(data.table)

dt = data.table(a = c(2, 3), b = c(5, 7), c = c(11, 13))

cols1 = sapply(c("a", "b"), as.symbol)
cols2 = sapply(c("b", "c"), as.symbol)
new_cols = c("d", "e")

> print(dt)
   a b  c
1: 2 5 11
2: 3 7 13

dt[, (new_cols) := purrr::map2(cols1, cols2, ~ eval(.x) + eval(.y))]

   a b  c  d  e
1: 2 5 11  7 16
2: 3 7 13 10 20
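If you would rather not depend on purrr, the same idea can be sketched with base R's Map() and plain character vectors, indexing .SD by name instead of evaluating symbols. The variable names lhs and rhs are hypothetical and used only for illustration:

```r
library(data.table)

dt <- data.table(a = c(2, 3), b = c(5, 7), c = c(11, 13))

lhs <- c("a", "b")       # first operand of each sum (hypothetical name)
rhs <- c("b", "c")       # second operand of each sum (hypothetical name)
new_cols <- c("d", "e")  # names of the columns to create

# Map() walks lhs and rhs in parallel; .SD[[x]] pulls a column out by name,
# so no as.symbol()/eval() is needed on the right-hand side.
dt[, (new_cols) := Map(function(x, y) .SD[[x]] + .SD[[y]], lhs, rhs)]

print(dt)
```

This should produce the same d and e columns as the purrr version above.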

Upvotes: 2
