Reputation: 2051
I have the following example:
library(data.table)
irisDT <- as.data.table(iris)
mod <- function(dat) {
  dat[, index := (1:nrow(dat))]
  setkey(dat, index)
  dat <- dat[2:10]
  dat[, index := NULL]
  invisible()
}
mod(irisDT)
names(irisDT) # it contains index
To my surprise, the index column still exists after calling the mod() function.
This is not the case when I delete the line dat <- dat[2:10].
I guess that, since rows cannot be deleted by reference yet, another
data.table is created.
However, I would like to delete the index column in the original
data.table.
Upvotes: 6
Views: 669
Reputation: 59612
Great question. data.table is copied-on-change, by <-, in the usual R way.
It isn't copied-on-change by := or the set* functions (setkey, setnames, setattr) provided by the data.table package.
So it's not anything special about the data.table object itself that decides whether it is copied, and it's passed as an argument to functions in exactly the same way as a data.frame. It's what you do to it inside the function that counts. The <- operator copies-on-change, and that's no different when used on a data.table. The := operator, on the other hand, assigns by reference.
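To make the difference concrete, here is a minimal sketch. The table DT and the helper functions add_col() and rebind() are made up for illustration; only data.table(), := and <- are real.

```r
library(data.table)

DT <- data.table(x = 1:3)

# := works on the object the caller passed in: no copy is made,
# so the new column is visible outside the function.
add_col <- function(dat) {
  dat[, y := x * 2L]
  invisible(NULL)
}
add_col(DT)
names(DT)  # "x" "y"

# Subsetting rows creates a new data.table, and <- binds the local
# name dat to it. From here on, := changes only that new object.
rebind <- function(dat) {
  dat <- dat[2:3]
  dat[, y := NULL]  # removes y from the local copy only
  invisible(NULL)
}
rebind(DT)
names(DT)  # still "x" "y"
```

This is the same thing that happens in mod(): the index column is added to the caller's table by reference, but it is removed only from the row-subsetted copy.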
As you say, there is no way (yet) to delete rows by reference, so until then you'll need to use standard R syntax to assign the copy back to the symbol in calling scope.
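One way to do that, as a sketch rather than the only option: have mod() return the modified copy and reassign it in the caller. This keeps the steps of the question's function in their original order and changes only the return value.

```r
library(data.table)

irisDT <- as.data.table(iris)

mod <- function(dat) {
  dat[, index := (1:nrow(dat))]  # added to the caller's table by reference
  setkey(dat, index)
  dat <- dat[2:10]               # dat now refers to a new, smaller table
  dat[, index := NULL]           # dropped from the new table only
  dat                            # return the new table instead of invisible()
}

# Assign the result back to the symbol in the calling scope.
irisDT <- mod(irisDT)
names(irisDT)  # no index column
nrow(irisDT)   # 9
```

Note that the old 150-row object still has the index column; it just isn't bound to irisDT any more. If other variables point at that old object, they will still see index.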
As it happens, there was a slide on this at last night's LondonR talk, which is now on the homepage under the presentation section (see the slide titled copy()).
Upvotes: 5