cosmia1

Reputation: 129

Subset a data.table based on the key NOT being an element of a list

I have the following data.table:

DT = data.table(ID = c(1, 2, 4, 5, 10), A = c(13, 1, 13, 11, 12))

DT
   ID  A
1:  1 13
2:  2  1
3:  4 13
4:  5 11
5: 10 12

The contents of column A are not important. I have a list/vector test <- c(1, 5, 9, 10, 11, 12, ...) that can be many times longer than the data.table. I want to select the rows of DT whose key ID is not present in test:

   ID  A
2:  2  1
3:  4 13

I think that DT[!(ID %in% test)] works, but I would like to take advantage of data.table's fast key-based subsetting. Note that test might have no elements in common with the key of DT, in which case the subset should return DT itself. It might also contain every key, in which case the result should be an empty data.table. Any suggestions?

Upvotes: 0

Views: 1218

Answers (2)

Waldi

Reputation: 41210

What about an anti-join:

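library(data.table)
DT   <- data.table(ID = c(1, 2, 4, 5, 10), A = c(13, 1, 13, 11, 12))
test <- data.table(ID = c(1, 5, 9, 10, 11, 12))  # wrap the vector in a data.table
setkey(test, ID)                                 # optional here, since on= names the join column
DT[!test, on = "ID"]                             # anti-join: rows of DT with no matching ID in test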
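Since the question asks specifically about key-based subsetting, a variant that keys DT itself and keeps test as a plain vector might look like the sketch below. It assumes test has the same type as ID (both numeric here), and the vector is cut off where the question writes "...".

```r
library(data.table)

DT   <- data.table(ID = c(1, 2, 4, 5, 10), A = c(13, 1, 13, 11, 12))
test <- c(1, 5, 9, 10, 11, 12)  # plain vector; the elided "..." part is not filled in

setkey(DT, ID)                  # sort DT by ID and mark ID as its key

# Not-join on the key: keep the rows of DT whose ID has no match in test
res <- DT[!.(test)]
res
```

With the sample data this should keep only the rows with ID 2 and 4. Whether it beats %in% in practice probably depends on the sizes of DT and test, so it is worth benchmarking on real data.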

Upvotes: 3

akrun

Reputation: 886938

We can use %in% and negate the result with !. Because %in% binds more tightly than !, no extra parentheses are needed:

DT[!ID %in% test]
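The two edge cases raised in the question can be checked with a short sketch. The vectors no_overlap and all_present are hypothetical names and values made up for illustration, not part of the original post.

```r
library(data.table)

DT <- data.table(ID = c(1, 2, 4, 5, 10), A = c(13, 1, 13, 11, 12))

no_overlap  <- c(100, 200)            # hypothetical: shares no value with DT$ID
all_present <- c(1, 2, 4, 5, 10, 99)  # hypothetical: covers every DT$ID

# No ID is excluded, so every row of DT comes back
keep_all <- DT[!ID %in% no_overlap]

# Every ID is excluded, so the result has zero rows but the same columns
keep_none <- DT[!ID %in% all_present]

# The anti-join form should behave the same way in both cases
aj_all  <- DT[!data.table(ID = no_overlap), on = "ID"]
aj_none <- DT[!data.table(ID = all_present), on = "ID"]
```

Both forms should return all five rows in the first case and an empty data.table with columns ID and A in the second.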

Upvotes: 0
