user3211060

Reputation: 1

In R filter rows of one table based off of matches to another table AND date less than other table

I have two tables in R similar to the following:

df1
lat  long  date
1.1  2.3   12-4-70
3.3  7.3   5-5-80
1.1  2.3   7-2-90

df2
lat  long  date
1.1  2.3   6-12-82
3.3  2.4   6-10-83
8.4  7.3   8-19-88

I want to select all rows from df1 that have a row in df2 where lat and long both match and the date in df1 is earlier than the date in df2. Given the tables above, my desired output would be:

filtered_df1
lat  long  date
1.1  2.3   12-4-70

Upvotes: 0

Views: 721

Answers (2)

akrun

Reputation: 887118

Another option is a non-equi join with data.table

library(data.table)
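# Convert both tables to data.table by reference and parse the month-day-year strings
setDT(df1)[, date := as.IDate(date, format = "%m-%d-%y")]
setDT(df2)[, date := as.IDate(date, format = "%m-%d-%y")]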
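# nomatch = NULL drops df2 rows with no match; x.date returns df1's own date
df1[df2, .(lat, long, date = x.date), on = .(lat, long, date < date), nomatch = NULL]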
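
As a cross-check that needs no packages, the same filter can be sketched in base R. This is only an illustration on the sample data from the question: it assumes the dates are stored as month-day-year strings, and the suffix `.df2` and the name `filtered_df1` are chosen here for readability.

```r
# Sample data from the question, dates as month-day-year strings
df1 <- data.frame(lat = c(1.1, 3.3, 1.1), long = c(2.3, 7.3, 2.3),
                  date = c("12-4-70", "5-5-80", "7-2-90"))
df2 <- data.frame(lat = c(1.1, 3.3, 8.4), long = c(2.3, 2.4, 7.3),
                  date = c("6-12-82", "6-10-83", "8-19-88"))

# Parse the strings so that `<` compares real dates, not text
df1$date <- as.Date(df1$date, format = "%m-%d-%y")
df2$date <- as.Date(df2$date, format = "%m-%d-%y")

# Equi-join on lat/long; df2's date is kept under the name date.df2
m <- merge(df1, df2, by = c("lat", "long"), suffixes = c("", ".df2"))

# Keep df1 rows whose date precedes the matched df2 date, without duplicates
filtered_df1 <- unique(m[m$date < m$date.df2, c("lat", "long", "date")])
filtered_df1
```

Because merge() first builds every lat/long match and only then filters, this can use much more memory than a true non-equi join on large tables.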

Upvotes: 1

Ian Campbell

Reputation: 24790

This is called a non-equi join. You can use the fuzzyjoin package to do this with dplyr:

library(dplyr)      # provides %>% and mutate()
library(fuzzyjoin)
library(lubridate)
df1 <- df1 %>% mutate(date = mdy(date))
df2 <- df2 %>% mutate(date = mdy(date))

fuzzy_inner_join(df1, df2, 
                 by = c("lat" = "lat", "long" = "long", "date" = "date"),
                 match_fun = list(`==`,`==`,`<`))
# A tibble: 1 x 6
  lat.x long.x date.x     lat.y long.y date.y    
  <dbl>  <dbl> <date>     <dbl>  <dbl> <date>    
1   1.1    2.3 1970-12-04   1.1    2.3 1982-06-12
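
The join result carries columns from both tables, whereas the question asks for rows of df1 only. A possible follow-up step, sketched here on the sample data, keeps the `.x` columns and restores their original names. The `distinct()` call is an added assumption: it guards against a df1 row appearing more than once when it matches several df2 rows.

```r
library(dplyr)
library(fuzzyjoin)
library(lubridate)

# Sample data from the question, with dates parsed by mdy()
df1 <- tibble(lat = c(1.1, 3.3, 1.1), long = c(2.3, 7.3, 2.3),
              date = mdy(c("12-4-70", "5-5-80", "7-2-90")))
df2 <- tibble(lat = c(1.1, 3.3, 8.4), long = c(2.3, 2.4, 7.3),
              date = mdy(c("6-12-82", "6-10-83", "8-19-88")))

filtered_df1 <- fuzzy_inner_join(df1, df2,
                                 by = c("lat" = "lat", "long" = "long", "date" = "date"),
                                 match_fun = list(`==`, `==`, `<`)) %>%
  # Keep only df1's columns and drop the .x suffix
  select(lat = lat.x, long = long.x, date = date.x) %>%
  # Collapse repeats caused by multiple matches in df2
  distinct()

filtered_df1
```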

Upvotes: 1
