Tony2016
Tony2016

Reputation: 267

Equi-join and rolling join in data.table

I'm looking to join tables on two fields, one using an equi-join and another using a rolling join. The data I'm using is below:

library(data.table)
dt <- data.table(Date = as.Date(c("2015-12-29", "2015-12-29", "2015-12-29", "2015-12-29", "2016-01-30", "2016-01    -30", "2016-01-30", "2016-01-30", "2016-02-29", "2016-02-29", "2016-02-29", "2016-02-29", "2016-03-26", "2016-03-26", "2016-03-26", "2016-03-26")), 
                   ID = c("A", "B", "C", "D", "A", "B", "C", "D", "A", "B", "C", "D", "A", "B", "C", "D"), 
                Value = c("A201512", "B201512", "C201512", "D201512", "A201601", "B201601", "C201601", "D201601", "A201602", "B201602", "C201602", "D201602", "A201603", "B201603", "C201603", "D201603"), key = c('Date', 'ID'))

dtes <- data.table(Date=as.Date(c("2015-12-31", "2016-01-31", "2016-02-29", "2016-03-31")), key="Date")

dte <- CJ(Date=dtes$Date, ID=unique(dt$ID))

I want to join the tables 'dt' and 'dte' on ID (using equi-join) AND Date (using rolling join)

dt[dte, roll=T]

gives me

#           Date ID   Value
#  1: 2015-12-31  A      NA
#  2: 2015-12-31  B      NA
#  3: 2015-12-31  C      NA
#  4: 2015-12-31  D      NA
#  5: 2016-01-31  A      NA
#  6: 2016-01-31  B      NA
#  7: 2016-01-31  C      NA
#  8: 2016-01-31  D      NA
#  9: 2016-02-29  A A201602
# 10: 2016-02-29  B B201602
# 11: 2016-02-29  C C201602
# 12: 2016-02-29  D D201602
# 13: 2016-03-31  A      NA
# 14: 2016-03-31  B      NA
# 15: 2016-03-31  C      NA
# 16: 2016-03-31  D      NA

The result I am after is this:

# Date      ID        Value
# 2016-03-31    A   A201603
# 2016-02-29    A   A201602
# 2016-01-31    A   A201601
# 2015-12-31    A   A201512
# 2016-03-31    B   B201603
# 2016-02-29    B   B201602
# 2016-01-31    B   B201601
# 2015-12-31    B   B201512
# 2016-03-31    C   C201603
# 2016-02-29    C   C201602
# 2016-01-31    C   C201601
# 2015-12-31    C   C201512
# 2016-03-31    D   D201603
# 2016-02-29    D   D201602
# 2016-01-31    D   D201601
# 2015-12-31    D   D201512

Is this possible in data.table?

Upvotes: 4

Views: 412

Answers (1)

Frank
Frank

Reputation: 66819

Yeah, set the keys in the reverse order; the roll goes on the final column being merged:

setkey(dt, ID, Date)
setkey(dte, ID, Date)
dt[dte, roll=TRUE][order(ID, -Date)]


          Date ID   Value
 1: 2016-03-31  A A201603
 2: 2016-02-29  A A201602
 3: 2016-01-31  A A201601
 4: 2015-12-31  A A201512
 5: 2016-03-31  B B201603
 6: 2016-02-29  B B201602
 7: 2016-01-31  B B201512
 8: 2015-12-31  B B201512
 9: 2016-03-31  C C201603
10: 2016-02-29  C C201602
11: 2016-01-31  C C201601
12: 2015-12-31  C C201512
13: 2016-03-31  D D201603
14: 2016-02-29  D D201602
15: 2016-01-31  D D201601
16: 2015-12-31  D D201512

Or, instead of setkey, just use X[Y, on=cols, roll=TRUE] writing cols in the correct order (assuming the bug mentioned in comments above is fixed).

Upvotes: 1

Related Questions