Filter data.table on multiple criteria in the same column

Question

I have the following data.table:

> dt= data.table(num=c(1,2,1,1,2, 3, 3,2), letters[1:8])
> dt
   num V2
1:   1  a
2:   1  c
3:   1  d
4:   2  b
5:   2  e
6:   2  h
7:   3  f
8:   3  g

I want to keep all rows where num equals 1 or 2 and get the resulting data.table. I can do this with:

> dt[num==1 | num==2,]
   num V2
1:   1  a
2:   1  c
3:   1  d
4:   2  b
5:   2  e
6:   2  h

Or:

rbind(setkey(dt, num)[J(1)],setkey(dt, num)[J(2)])

But is there an option with setkey that makes the second expression shorter, something like:

setkey(dt, num)[1|2]

The setkey approach is quicker for very large amounts of data, so I would appreciate some help!

ctbrown · Accepted Answer

In addition to KFB's comment:

setkey(dt, num)[num %in% c(1,2)]

If the filtering values are integers in a sequence:

setkey(dt,num)[J(1:2)]    # OR
setkey(dt,num)[J(seq(1, 2))]

If they are arbitrary:

setkey(dt,num)[J(c(1,2))]

NOTE 1: This may not work in older versions of data.table.

NOTE 2: . is an alias for J and is more readable:

 setkey(dt,num)[.(1:2)]
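To make the comparison concrete, here is a minimal sketch using the table from the question. It assumes a reasonably recent data.table (in particular for nomatch = NULL); the names by_in, by_join, with_na and without_na are illustrative only.

```r
library(data.table)

# Table from the question; the unnamed second column is auto-named V2.
dt <- data.table(num = c(1, 2, 1, 1, 2, 3, 3, 2), letters[1:8])
setkey(dt, num)  # sorts dt by num in place and marks num as the key

by_in   <- dt[num %in% c(1, 2)]  # logical filter on the column
by_join <- dt[.(c(1, 2))]        # join on the key column

# Both select the same six rows (num == 1 or num == 2).
print(by_join)

# One difference: by default a join returns an NA row for a value
# that has no match in the key, whereas %in% simply ignores it.
with_na    <- dt[.(c(1, 4))]                  # 3 matches plus 1 NA row for num = 4
without_na <- dt[.(c(1, 4)), nomatch = NULL]  # 3 matches only
```

If the lookup values may be absent from the table, passing nomatch = NULL keeps the join form behaving like the %in% filter.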

FWIW, I like using the magrittr package with data.table and making everything as clear as possible:

library(magrittr)
dt %>% setkey(num)
dt[ .(1:2) ]

The drawback is that you can't do this neatly on one line.
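If a single expression is the goal, one possible alternative (not part of the answer above) is the on= argument, available in data.table 1.9.6 and later, which performs the same join without setting a key first. A minimal sketch, with res as an illustrative name:

```r
library(data.table)

# Table from the question, left unkeyed and in its original row order.
dt <- data.table(num = c(1, 2, 1, 1, 2, 3, 3, 2), letters[1:8])

# Ad hoc join on num: no setkey() call, and dt itself is not reordered.
res <- dt[.(num = c(1, 2)), on = "num"]
print(res)
```

Rows come back grouped in the order of the lookup values (all num == 1 rows, then all num == 2 rows), while dt keeps its original order.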
