Date merging between range and single dates

Question

I have a dataset datA that looks like this:

ID     LINE_ID     Fromdate     Todate     UniqueID
1      1           1/1/2015     2/3/2015   11
1      2           3/7/2015     3/9/2015   12
2      1           2/2/2015     2/8/2015   21
3      1           1/3/2013     1/8/2013   31

And a dataset datB that looks like this:

ID    LINE_ID     Date     UniqueID
1     1           1/1/2015 11
1     2           1/3/2015 12
1     3           2/2/2015 13
1     4           2/8/2015 14
1     5           3/8/2015 15
2     1           2/2/2015 21
2     2           2/3/2015 22
2     3           2/7/2015 23
2     4           2/8/2015 24
3     1           1/3/2013 31
3     2           1/7/2013 32
3     3           1/8/2013 33
3     4           1/9/2013 34

What I want to do is either combine the two datasets or tag each line in datB with the UniqueID from datA that it belongs to. For example, using the two datasets above, I would like datC to look like datB with an added column giving the corresponding UniqueID from datA:

ID    LINE_ID     Date     UniqueID     UniqueID.A
1     1           1/1/2015 11           11
1     2           1/3/2015 12           11
1     3           2/2/2015 13           11
1     4           2/8/2015 14           NA
1     5           3/8/2015 15           12
2     1           2/2/2015 21           21
2     2           2/3/2015 22           21
2     3           2/7/2015 23           21
2     4           2/8/2015 24           21
3     1           1/3/2013 31           31
3     2           1/7/2013 32           31
3     3           1/8/2013 33           31
3     4           1/9/2013 34           NA

As you can see, the new column gives the UniqueID of the datA row with the same ID whose Fromdate to Todate range contains the Date in datB, or NA if no range contains it.

Does anyone have any guidance on how to do this in R?

Pierre L · Accepted Answer

To help with the implementation, be sure that the columns have the correct types and that the keys are set:

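library(data.table)

# Create start and end columns as dates (a single date is a zero-length interval)
setDT(datB)[, c("start", "end") := as.IDate(Date, format="%m/%d/%Y")]

# Convert the range columns to dates
setDT(datA)[, c("Fromdate", "Todate") :=
                lapply(.SD, as.IDate, format="%m/%d/%Y"),
            .SDcols=c("Fromdate", "Todate")]

# Set keys for matching: ID first, so that ranges only match within the same ID,
# then the interval columns, which must come last
setkey(datA, ID, Fromdate, Todate)
setkey(datB, ID, start, end)

# For each datB row, find the datA interval with the same ID that contains its date.
# datA columns come first in the result; clashing datB columns get an "i." prefix.
foverlaps(datB, datA, type="within")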
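
For anyone who wants to run this end to end, here is a minimal reproducible sketch built on the sample data from the question. It assumes the dates are month/day/year and that a range should only match rows with the same ID. The names ranges and datC and the column UniqueID.A are just illustrative choices that mirror the desired output; they are not part of any API.

```r
library(data.table)

# Sample data copied from the question
datA <- data.table(
  ID       = c(1, 1, 2, 3),
  LINE_ID  = c(1, 2, 1, 1),
  Fromdate = c("1/1/2015", "3/7/2015", "2/2/2015", "1/3/2013"),
  Todate   = c("2/3/2015", "3/9/2015", "2/8/2015", "1/8/2013"),
  UniqueID = c(11, 12, 21, 31)
)
datB <- data.table(
  ID       = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  LINE_ID  = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4),
  Date     = c("1/1/2015", "1/3/2015", "2/2/2015", "2/8/2015", "3/8/2015",
               "2/2/2015", "2/3/2015", "2/7/2015", "2/8/2015",
               "1/3/2013", "1/7/2013", "1/8/2013", "1/9/2013"),
  UniqueID = c(11, 12, 13, 14, 15, 21, 22, 23, 24, 31, 32, 33, 34)
)

# Parse the date strings (assumed to be month/day/year)
datA[, c("Fromdate", "Todate") := lapply(.SD, as.IDate, format = "%m/%d/%Y"),
     .SDcols = c("Fromdate", "Todate")]
# Treat each single date in datB as a zero-length interval [start, end]
datB[, start := as.IDate(Date, format = "%m/%d/%Y")]
datB[, end := start]

# Keep only what is needed from datA and rename its UniqueID to avoid a name clash
ranges <- datA[, .(ID, Fromdate, Todate, UniqueID.A = UniqueID)]
# foverlaps needs the lookup table keyed, with the interval columns last
setkey(ranges, ID, Fromdate, Todate)

# Match each datB date to the range with the same ID that contains it;
# rows without a containing range are kept with NA
datC <- foverlaps(datB, ranges, by.x = c("ID", "start", "end"), type = "within")

# Restore the layout of datB plus the new column
datC <- datC[order(ID, LINE_ID), .(ID, LINE_ID, Date, UniqueID, UniqueID.A)]
print(datC)
```

With the sample data this should reproduce the UniqueID.A column shown in the question, including the two NA rows (ID 1 on 2/8/2015 and ID 3 on 1/9/2013). Putting ID in the key is what keeps 2/8/2015 for ID 1 from matching the range that belongs to ID 2.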
