Reputation: 1
I have two tables in R similar to the following:
df1
lat long date
1.1 2.3 12-4-70
3.3 7.3 5-5-80
1.1 2.3 7-2-90
df2
lat long date
1.1 2.3 6-12-82
3.3 2.4 6-10-83
8.4 7.3 8-19-88
I want to select all rows from df1 that have a row in df2 where the lat and long both match and the date in df1 is earlier than the date in df2. Given the tables above, my desired output would be:
filtered_df1
lat long date
1.1 2.3 12-4-70
Upvotes: 0
Views: 721
Reputation: 887118
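Another option is a non-equi join with data.table:
library(data.table)
# Parse the month-day-year strings so the dates compare chronologically
setDT(df1)[, date := as.IDate(date, "%m-%d-%y")]
setDT(df2)[, date := as.IDate(date, "%m-%d-%y")]
# nomatch = NULL drops df2 rows without a match; x.date keeps df1's own date
df1[df2, on = .(lat, long, date < date), .(lat, long, date = x.date), nomatch = NULL]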
Upvotes: 1
Reputation: 24790
This is called a non-equi join. You can use the fuzzyjoin package to do this with dplyr:
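library(dplyr)      # provides %>% and mutate()
library(fuzzyjoin)
library(lubridate)
# Parse the month-day-year strings into Date objects
df1 <- df1 %>% mutate(date = mdy(date))
df2 <- df2 %>% mutate(date = mdy(date))
# Match lat and long exactly, and require df1's date to be earlier than df2's
fuzzy_inner_join(df1, df2,
  by = c("lat" = "lat", "long" = "long", "date" = "date"),
  match_fun = list(`==`, `==`, `<`))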
# A tibble: 1 x 6
lat.x long.x date.x lat.y long.y date.y
<dbl> <dbl> <date> <dbl> <dbl> <date>
1 1.1 2.3 1970-12-04 1.1 2.3 1982-06-12
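If you would rather avoid extra packages, the same filter can be sketched in base R: do an ordinary equi-join on lat and long with merge(), then keep only the pairs where df1's date is earlier. This is just a minimal sketch built on the sample data from the question. The helper name m and the ".df2" suffix are my own choices, and it assumes two-digit years should be read the way R's %y reads them (70 becomes 1970).

```r
# Sample data copied from the question
df1 <- data.frame(lat = c(1.1, 3.3, 1.1), long = c(2.3, 7.3, 2.3),
                  date = c("12-4-70", "5-5-80", "7-2-90"))
df2 <- data.frame(lat = c(1.1, 3.3, 8.4), long = c(2.3, 2.4, 7.3),
                  date = c("6-12-82", "6-10-83", "8-19-88"))

# Parse the month-day-year strings into Date objects
df1$date <- as.Date(df1$date, format = "%m-%d-%y")
df2$date <- as.Date(df2$date, format = "%m-%d-%y")

# Equi-join on lat/long; df2's date column gets the (assumed) ".df2" suffix
m <- merge(df1, df2, by = c("lat", "long"), suffixes = c("", ".df2"))

# Keep pairs where df1's date is earlier, then return only df1's columns;
# unique() collapses df1 rows that matched more than one df2 row
filtered_df1 <- unique(m[m$date < m$date.df2, c("lat", "long", "date")])
filtered_df1
```

Note that merge() builds every lat/long pairing before the date filter is applied, so on large tables a true non-equi join is likely to be the better fit.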
Upvotes: 1