statquant
statquant

Reputation: 14370

how can I select all rows of a table that match those in another table

I am trying to do something very simple with data.table and I lost the idiomatic way to do it

library(data.table)
set.seed(1)
DT = data.table(a=sample(letters,1e5,T), b=sample(letters,1e5,T), c=rnorm(1e5))
DT2 = data.table(a=sample(letters,5,T), b=sample(letters,5,T)) 

DT2
   a b
1: k h
2: e v
3: f n
4: m q
5: w v

I want to select the rows of DT that match those of DT2. As such the number of rows after operation will always be smaller that the initial table.

I want something doing this:

> DT[paste(a,b) %chin% DT2[,paste(a,b)]]
     a b          c
  1: m q -0.4974579
  2: e v -0.1325602
  3: w v -1.8081050
  4: m q  0.9025120
  5: w v -0.4958802
 ---               
729: f n  0.5604650
730: f n -1.2607321
731: m q  0.5146013
732: m q -1.8329656
733: k h -0.9752011
> DT2[paste(a,b) %chin% DT[,paste(a,b)]]
   a b
1: e v
2: f n
3: k h
4: m q
5: w v
> 

Upvotes: 2

Views: 50

Answers (1)

BrodieG
BrodieG

Reputation: 52637

An inner join should do:

setkey(DT, a, b)[DT2, nomatch=0]

Produces:

     a b          c
  1: k h -1.6592442
  2: k h  1.1946471
  3: k h -0.8694933
  4: k h  0.7789158
  5: k h -1.3142607
 ---               
729: w v -0.3516787
730: w v  0.5272145
731: w v -0.7531717
732: w v  0.3352228
733: w v  0.1182353

If you want to know which values in DT2 exist in DT then:

unique(setkey(DT[, .(a, b)], a, b))[DT2, nomatch=0]

Upvotes: 3

Related Questions