Reputation: 5220

Select multiple columns in data.table by their numeric indices

How can we select multiple columns using a vector of their numeric indices (position) in data.table?

This is how we would do with a data.frame:

df <- data.frame(a = 1, b = 2, c = 3)
df[ , 2:3]
#   b c
# 1 2 3

Upvotes: 156

Answers (5)

Josh O'Brien

Reputation: 162461

For versions of data.table >= 1.9.8, the following all just work:

library(data.table)
dt <- data.table(a = 1, b = 2, c = 3)

# select single column by index
dt[, 2]
#    b
# 1: 2

# select multiple columns by index
dt[, 2:3]
#    b c
# 1: 2 3

# select single column by name
dt[, "a"]
#    a
# 1: 1

# select multiple columns by name
dt[, c("a", "b")]
#    a b
# 1: 1 2

For versions of data.table < 1.9.8 (for which numerical column selection required the use of with = FALSE), see this previous version of this answer. See also NEWS on v1.9.8, POTENTIALLY BREAKING CHANGES, point 3.

Upvotes: 196

rafa.pereira

Reputation: 13827

From v1.10.2 onwards, you can also use ..

dt <- data.table(a=1:2, b=2:3, c=3:4)

keep_cols = c("a", "c")

dt[, ..keep_cols]

Upvotes: 22

R Yoda

Reputation: 8770

If you want to use column names to select the columns, simply use .(), which is an alias for list():

library(data.table)
dt <- data.table(a = 1:2, b = 2:3, c = 3:4)
dt[ , .(b, c)] # select the columns b and c
# Result:
#    b c
# 1: 2 3
# 2: 3 4

Upvotes: 39

Bhoom Suktitipat

Reputation: 2217

@Tom, thank you very much for pointing out this solution. It works great for me.

I was looking for a way to just exclude one column from printing and from the example above. To exclude the second column you can do something like this

library(data.table)
dt <- data.table(a=1:2, b=2:3, c=3:4)
dt[,.SD,.SDcols=-2]
dt[,.SD,.SDcols=c(1,3)]

Upvotes: 3

Tom

Reputation: 1291

It's a bit verbose, but i've gotten used to using the hidden .SD variable.

b<-data.table(a=1,b=2,c=3,d=4)
b[,.SD,.SDcols=c(1:2)]

It's a bit of a hassle, but you don't lose out on other data.table features (I don't think), so you should still be able to use other important functions like join tables etc.

Upvotes: 46

Select multiple columns in data.table by their numeric indices

Answers (5)

Related Questions