Reputation: 1681
I have two dataframes df.o
and df.m
as defined below. I need to find which observation in df.o
(dimension table) corresponds which observations in df.m
(fact table) based on two criteria: 1) df.o$Var1==df.o$Var1
and df.o$date1 < df.m$date2 < df.o$date3
such that I get the correct value of df.o$oID
in df.m$oID
(the correct value is manually entered in df.m$CORRECToID
). I need the ID to complete a merge afterwards.
df.o <- data.frame(oID=1:4,
Var1=c("a","a","b","c"),
date3=c(2015,2011,2014,2015),
date1=c(2013,2009,2012,2013),
stringsAsFactors=FALSE)
df.m <- data.frame(mID=1:3,
Var1=c("a","a","b"),
date2=c(2014,2010,2013),
oID=NA,
CORRECToID=c(1,2,3),
points=c(5, 10,15),
stringsAsFactors=FALSE)
I have tried various combinations of like the code below, but without luck:
df.m$oID[df.m$date2 < df.o$date3 & df.m$date2 > df.o$date1 & df.o$Var1==df.m$Var1] <- df.o$oID
I have also tried experimenting with various combinations of ifelse
, which
and match
, but none seem to do the trick.
The problem I keep encountering is that my replacement was a different number of rows than data and that "longer object length is not a multiple of shorter object length".
Upvotes: 1
Views: 119
Reputation: 92302
What you are looking for is called an "overlap join", you could try the data.table::foverlaps
function in order to achieve this.
The idea is simple
df.m
)run foverlaps
and select the column you want back
library(data.table)
setkey(setDT(df.m)[, date4 := date2], Var1, date2, date4)
setkey(setDT(df.o), Var1, date1, date3)
foverlaps(df.m, df.o)[, names(df.m), with = FALSE]
# mID Var1 date2 oID CORRECToID points date4
# 1: 2 a 2010 2 2 10 2010
# 2: 1 a 2014 1 1 5 2014
# 3: 3 b 2013 3 3 15 2013
Upvotes: 2