Cartesian Rolling Join using Data.table

Question

I have two tables:

dat: contains the data
dates: contains the table of dates

library(data.table)

dates = structure(list(date = structure(c(17562, 17590, 17621, 17651, 
                              17682, 17712, 17743, 17774, 17804, 17835, 17865, 17896), class = "Date")), 
      row.names = c(NA, -12L), class = "data.frame")


dat = structure(list(date = structure(c(17546, 17743, 17778, 17901, 
                              17536, 17806, 17901, 17981, 17532, 17722, 17969, 18234), class = "Date"), 
           country = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 
                                 3L, 3L, 3L), .Label = c("AAA", "BBB", "CCC"), class = "factor"), 
           state = structure(c(1L, 1L, 2L, 3L, 4L, 1L, 2L, 5L, 6L, 1L, 
                               2L, 2L), .Label = c("S1", "S2", "S3", "S4", "S5", "S6"), class = "factor"), 
           item = structure(c(1L, 2L, 4L, 6L, 3L, 5L, 3L, 2L, 2L, 4L, 
                              5L, 7L), .Label = c("M1", "M2", "M3", "M4", "M5", "M6", "M7"
                              ), class = "factor"), value = c(67L, 10L, 50L, 52L, 93L, 
                                                              50L, 62L, 46L, 6L, 30L, 30L, 14L)), row.names = c(NA, -12L
                                                              ), class = "data.frame")


dates = data.table(dates)
dat = data.table(dat)


setkey(dates, date)
setkey(dat, date)

The result I'm after is below. I.e doing a rolling join with each individual row of dat and then combining the result.

rbind(
dat[1,][dates, roll = 90],
dat[2,][dates, roll = 90],
dat[3,][dates, roll = 90],
...
dat[12,][dates, roll = 90]
)

My actual dataset is much larger so it's no practical to list every row of dat. Is there a short hand way of doing the same thing without a loop?

SatZ · Accepted Answer

If I understand your intent correctly, you want to rollover the records for 90 days. I used a cross join and then used the rollover criteria to subset

Your original tables:

library(data.table)

dates = structure(list(date = structure(c(17562, 17590, 17621, 17651, 
                                          17682, 17712, 17743, 17774, 17804, 17835, 17865, 17896), class = "Date")), 
                  row.names = c(NA, -12L), class = "data.frame")


dat = structure(list(date = structure(c(17546, 17743, 17778, 17901, 
                                        17536, 17806, 17901, 17981, 17532, 17722, 17969, 18234), class = "Date"), 
                     country = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 
                                           3L, 3L, 3L), .Label = c("AAA", "BBB", "CCC"), class = "factor"), 
                     state = structure(c(1L, 1L, 2L, 3L, 4L, 1L, 2L, 5L, 6L, 1L, 
                                         2L, 2L), .Label = c("S1", "S2", "S3", "S4", "S5", "S6"), class = "factor"), 
                     item = structure(c(1L, 2L, 4L, 6L, 3L, 5L, 3L, 2L, 2L, 4L, 
                                        5L, 7L), .Label = c("M1", "M2", "M3", "M4", "M5", "M6", "M7"
                                        ), class = "factor"), value = c(67L, 10L, 50L, 52L, 93L, 
                                                                        50L, 62L, 46L, 6L, 30L, 30L, 14L)), row.names = c(NA, -12L
                                                                        ), class = "data.frame")


dates = data.table(dates)
dat = data.table(dat)

Note, I haven't setkey.

I am using a cross join function from the reference: How to do cross join in R?

CJ.table.1 <- function(X,Y)
  setkey(X[,c(k=1,.SD)],k)[Y[,c(k=1,.SD)],allow.cartesian=TRUE][,k:=NULL]

Then I cross join, subset for the roll join, rename columns and sort

dsn1<-CJ.table.1(dat,dates)[i.date-date<=90 & i.date-date>=0][,.(date=i.date,country, state, item, value)][order(country, state, item, value,date),]

Cartesian Rolling Join using Data.table

Answers (2)

Related Questions