Let's Code
Let's Code

Reputation: 99

Match two files with recency of date in r

I have the data as follows:

FileA <- data.frame(
         Item = c(101,102,103,104,105),
         Date = as.Date(c("2020-08-17","2020-07-15","2020-06-17","2020-04-17","2020-04-12")),
         Price =c(5,6,7,8,9))
FileB <- data.frame(
  Item = c(101,101,102,103,103),
  Date = as.Date(c("2020-08-01","2020-08-16","2020-05-17","2020-03-10","2019-05-12")),
  Price =c(15,16,17,18,19))

Item 101 ,102 and 103 are both present in two different data frames. I want to match from FileA to FileB based on item and the most recent date to the same item is FileB.

For example: 'Item 101' from FileA should be matched to 'Item 101' in FileB with date "2020-08-16"(as it is more recent to the date for 'Item 101' in File'A') and not matched to "2020-08-01".

The output should be like: enter image description here

Upvotes: 0

Views: 65

Answers (1)

ismirsehregal
ismirsehregal

Reputation: 33417

Here is a data.table approach using roll = "nearest":

FileA <- data.frame(
  Item = c(101,102,103,104,105),
  Date = as.Date(c("2020-08-17","2020-07-15","2020-06-17","2020-04-17","2020-04-12")),
  Price =c(5,6,7,8,9))
FileB <- data.frame(
  Item = c(101,101,102,103,103),
  Date = as.Date(c("2020-08-01","2020-08-16","2020-05-17","2020-03-10","2019-05-12")),
  Price =c(15,16,17,18,19))


library(data.table)
setDT(FileA)
setDT(FileB)

FileB[, Date_B := Date][, Item_Code_B := Item]
resultDT <- na.omit(FileB[FileA, on = c("Item", "Date"), roll = "nearest"])

setnames(resultDT, c("Item", "Date", "Price", "i.Price"), c("Item_Code_A", "Date_A", "Price_A", "Price_B"))
setcolorder(resultDT, c("Item_Code_A", "Date_A", "Price_A", "Item_Code_B", "Date_B", "Price_B"))

resultDT

> resultDT
   Item_Code_A     Date_A Price_A Item_Code_B     Date_B Price_B
1:         101 2020-08-17       5         101 2020-08-16      16
2:         102 2020-07-15       6         102 2020-05-17      17
3:         103 2020-06-17       7         103 2020-03-10      18

You can use setDF(resultDT) to convert the result back to a "normal" data.frame.

Upvotes: 1

Related Questions