R data.table J behavior

Question

I am still puzzled by the behavior of data.table J.

> DT = data.table(A=7:3,B=letters[5:1])
> DT
   A B
1: 7 e
2: 6 d
3: 5 c
4: 4 b
5: 3 a
> setkey(DT, A, B)

> DT[J(7,"e")]
   A B
1: 7 e

> DT[J(7,"f")]
   A B
1: 7 f  # <- there is no such line in DT

There is no row (7, "f") in DT, so why do we get this result?

MattLBeck · Accepted Answer

J(7, "f") is literally a single-row data.table that you are joining DT with. When you call x[i], data.table takes each row of i and looks up all of its matches in x. By default, a row of i that matches nothing is still returned, with NA in the columns that come from x. This is easier to see after adding another column to DT:

DT <- data.table(A=7:3,B=letters[5:1],C=letters[1:5])
setkey(DT, A, B)
DT[J(7,"f")]
#    A B  C
# 1: 7 f NA

The row you are seeing is the single row of J(7, "f"), which matches nothing in DT. To stop data.table from returning non-matching rows, use nomatch=0:

DT[J(7,"f"), nomatch=0]
# Empty data.table (0 rows) of 3 cols: A,B,C
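
As a further illustration, here is a minimal sketch of the same join with a two-row i, where one row matches and one does not. The names lookup, res_default and res_matched are made up for this example, and the integer literal 7L is used only so that the type of A in i matches the integer column A in DT.

```r
library(data.table)

DT <- data.table(A = 7:3, B = letters[5:1], C = letters[1:5])
setkey(DT, A, B)

# i has two rows: (7, "e") exists in DT, (7, "f") does not.
# `lookup` is a hypothetical name used only for this sketch.
lookup <- data.table(A = c(7L, 7L), B = c("e", "f"))

# Default (nomatch = NA): one output row per row of i. The unmatched
# row keeps its key values from i and gets NA in the non-key column C.
res_default <- DT[lookup]

# nomatch = 0: rows of i without a match in DT are dropped.
res_matched <- DT[lookup, nomatch = 0]

print(res_default)
print(res_matched)
```

With the default, the (7, "f") row comes back with C set to NA, while nomatch=0 leaves only the (7, "e") row. Recent data.table releases document nomatch=NULL as the preferred spelling of the same behavior, with 0 kept for backward compatibility; check ?data.table for your installed version.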
