Reputation: 391
I'm trying to merge 2 data frames that look like this:
library(data.table)
#transactions
colNames<-c("id","tran")
df2 <- data.table(c("010","010","030","210","310","050"), as.Date(c("2012-12-28","2014-01-01","2011-07-05","2015-04-05","2013-07-05","2012-08-01")))
names(df2) <- colNames
#status change
colNames<-c("id","status")
df1 <- data.table(c("010","010","010","030","030","210","210","310","050"),
as.Date(c("2012-10-28","2013-11-01","2014-01-01","2011-05-09","2011-08-04","2013-07-06","2015-01-01","2013-05-04","2010-09-10")))
names(df1) <- colNames
df3
id tran status
1: 010 2012-12-28 2012-10-28
2: 010 2014-01-02 2014-01-01
3: 030 2011-07-05 2011-05-09
4: 210 2015-04-05 2015-01-01
5: 310 2013-07-05 2013-05-04
6: 050 2012-08-01 2010-09-10
Essentially, all transactions happen at some point after a status change. I'm trying to merge all transactions with their appropriate status change for each ID. The tricky part is that the dates are almost never the same. I need the status change date that it was for each transaction...
I'm looking at ?merge but I don't see how it can do something like that. Maybe ?aggregate but how would it know that the aggregation is conditional on another data frame?
Thanks!
Upvotes: 2
Views: 453
Reputation: 92282
You could simply perform rolling join.
df2[df1, status := i.status, on = .(id, tran = status), roll = -Inf]
df2
# id tran status
# 1: 010 2012-12-28 2012-10-28
# 2: 010 2014-01-01 2014-01-01
# 3: 030 2011-07-05 2011-05-09
# 4: 210 2015-04-05 2015-01-01
# 5: 310 2013-07-05 2013-05-04
# 6: 050 2012-08-01 2010-09-10
roll = -Inf
just means that for each incident in df2
, we want to match the closest lower incident in df1
no matter how far it is.status := i.status
means that we want to create a column called status
in df2
by reference (in place) while borrowing the matching values from df1$status
. i.
states for the column from the table in the i
th locating in df2
Upvotes: 2