Alfstat

Reputation: 11

colnames() behaviour with data.table in R

Calling colnames() on a data.table seems to return a variable that behaves as if it were passed by reference. I'm using R 3.6.0 and data.table 1.12.2.

library(data.table)
DT = data.table(
  ID = c("b","b","b","a","a","c"),
  a = 1:6,
  b = 7:12,
  c = 13:18
)

column_names = colnames(DT)
DT[, e := 23:28]
column_names 

I expected column_names to be "ID", "a", "b" and "c", without the newly added column "e". However, column_names has been updated to include it. Is this behaviour normal?

Upvotes: 1

Views: 78

Answers (1)

akrun

Reputation: 887118

To keep column_names from changing when DT is later modified by reference, we need to take an explicit copy(). The documentation in ?copy explains why:

A copy() may be required when doing dt_names = names(DT). Due to R's copy-on-modify, dt_names still points to the same location in memory as names(DT). Therefore modifying DT by reference now, say by adding a new column, dt_names will also get updated. To avoid this, one has to explicitly copy: dt_names <- copy(names(DT))

So, we do

column_names = copy(colnames(DT))

Now, after adding the new column by reference, column_names keeps its original value:

DT[, e := 23:28]
column_names 
#[1] "ID" "a"  "b"  "c" 
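
As a minimal sketch of the difference, the following compares a plain assignment with an explicit copy on a small made-up table. The names aliased, snapshot, same_before and copied_elsewhere are hypothetical and used only for illustration; address() is data.table's helper for inspecting where an object lives in memory. Whether the plain assignment actually picks up the new column may depend on the data.table version, so only the copied vector's output is annotated.

```r
library(data.table)

# Small illustrative table (not the one from the question)
DT <- data.table(ID = c("b", "a", "c"), a = 1:3)

aliased  <- colnames(DT)        # plain assignment, no explicit copy
snapshot <- copy(colnames(DT))  # explicit, independent copy

# Compare memory addresses before DT is modified
same_before      <- address(aliased) == address(names(DT))
copied_elsewhere <- address(snapshot) != address(names(DT))
print(same_before)       # expected to be TRUE if the names vector is shared
print(copied_elsewhere)  # the copy lives at a different address

DT[, e := 4:6]           # add a column by reference

print(snapshot)          # [1] "ID" "a"
print(aliased)           # may now include "e", per the ?copy note
```

If the addresses match before the := call, the plain assignment is sharing memory with the names of DT, which is the situation the ?copy note describes.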

Upvotes: 3
