Reputation: 3387
I have the following data.table (DT):
DT <- data.table(V1 = 1:3, V2 = 4:6, V3 = 7:9)
I would like to select a subset of the variables programmatically (dynamically), using an object in which the relevant variable names are stored. For example, I want to select the two columns "V1" and "V3", whose names are stored in a variable "keep":
keep <- c("V1", "V3")
If DT were a data.frame, the following would select the "keep" columns:
DT[keep]
Unfortunately, this does not work when DT is a data.table. I thought data.frame and data.table behaved identically in this respect, but apparently they don't. Can anybody advise on the correct syntax?
Upvotes: 28
Views: 37071
Reputation: 33488
Some more possibilities:
DT[, .SD, .SDcols = keep]
DT[, mget(keep)]
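A minimal sketch comparing the two forms on the question's data, assuming the data.table package is installed. The names via_sd and via_mget are only illustrative:

```r
library(data.table)

DT <- data.table(V1 = 1:3, V2 = 4:6, V3 = 7:9)
keep <- c("V1", "V3")

# .SD is the "subset of data"; .SDcols restricts it to the columns named in keep
via_sd <- DT[, .SD, .SDcols = keep]

# mget() looks up each name in keep and returns the columns as a named list,
# which data.table turns back into a data.table
via_mget <- DT[, mget(keep)]

names(via_sd)
names(via_mget)
```

Both calls should give a data.table holding only the columns listed in keep.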
Upvotes: 3
Reputation: 115392
This is covered in FAQ 1.1, 1.2 and 2.17.
Some possibilities:
DT[, keep, with = FALSE]
DT[, c('V1', 'V3'), with = FALSE]
DT[, c(1, 3), with = FALSE]
DT[, list(V1, V3)]
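As a quick check that these forms select the same columns, here is a small sketch on the question's data (assuming data.table is installed; by_name, by_position and by_symbol are illustrative names only):

```r
library(data.table)

DT <- data.table(V1 = 1:3, V2 = 4:6, V3 = 7:9)
keep <- c("V1", "V3")

by_name     <- DT[, keep, with = FALSE]     # column names held in a variable
by_position <- DT[, c(1, 3), with = FALSE]  # column positions
by_symbol   <- DT[, list(V1, V3)]           # columns written out as symbols

names(by_name)
names(by_position)
names(by_symbol)
```

Only the first form is driven by the contents of keep; the last one hard-codes the column names in the call.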
The reason DF[c('V1','V3')] works as it does for a data.frame is covered in ?`[.data.frame`:
Data frames can be indexed in several modes. When [ and [[ are used with a single vector index (x[i] or x[[i]]), they index the data frame as if it were a list. In this usage a drop argument is ignored, with a warning.
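The list-style indexing described in that help text can be seen in a short base R sketch, where DF is an illustrative data.frame built from the same values as the question's DT:

```r
DF <- data.frame(V1 = 1:3, V2 = 4:6, V3 = 7:9)
keep <- c("V1", "V3")

# Single vector index: DF is treated as a list of columns
cols_single <- DF[keep]       # a data.frame with only V1 and V3
col_vector  <- DF[["V1"]]     # the V1 column as a plain vector

# Two-index form: rows left empty, columns given by name
cols_double <- DF[, keep]

names(cols_single)
```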
From data.table 1.10.2, you may use the .. prefix when subsetting columns programmatically:
When j is a symbol prefixed with .. it will be looked up in calling scope and its value taken to be column names or numbers [...] It is experimental.
Thus:
DT[ , ..keep]
# V1 V3
# 1: 1 7
# 2: 2 8
# 3: 3 9
Upvotes: 37