Reputation: 385
I'd like to filter one data frame ('data') based on several values in a different data frame ('key').
My 'key' looks like this:
library(dplyr)  # provides tibble(); data_frame() is deprecated
exhibit.name <- c("lions", "otters", "penguins")
exhibit.start <- c(as.Date("2016-04-01"), as.Date("2016-05-01"), as.Date("2016-06-01"))
exhibit.end <- c(as.Date("2016-04-30"), as.Date("2016-05-31"), as.Date("2016-06-30"))
key <- tibble(exhibit.name, exhibit.start, exhibit.end)
And my 'data' looks like this:
exhibit.name <- c("lions", "lions", "otters",
                  "otters", "penguins", "penguins")
exhibit.date <- c(as.Date("2016-04-15"), as.Date("2016-12-15"), as.Date("2016-05-15"),
                  as.Date("2016-02-15"), as.Date("2016-06-15"), as.Date("2016-10-15"))
data <- tibble(exhibit.name, exhibit.date)  # tibble() replaces the deprecated data_frame()
I need to filter 'data' to return the rows where data$exhibit.name matches key$exhibit.name AND where data$exhibit.date falls between the related key$exhibit.start and key$exhibit.end dates. The resulting data frame would look like this:
> valid.exhibits
1|lions |2016-04-15
2|otters |2016-05-15
3|penguins|2016-06-15
Thanks!
Upvotes: 4
Views: 129
Reputation: 887118
We can do a left_join and then filter:
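library(dplyr)
data %>%
  left_join(key, by = "exhibit.name") %>%  # attach each exhibit's start/end dates
  # strict bounds; use <= and >= instead if the first and last day should count
  filter(exhibit.start < exhibit.date, exhibit.end > exhibit.date) %>%
  select(exhibit.name, exhibit.date)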
# exhibit.name exhibit.date
# <chr> <date>
#1 lions 2016-04-15
#2 otters 2016-05-15
#3 penguins 2016-06-15
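If the first and last day of an exhibit should count as valid, the same result can be had with an inclusive range join. The following is only a sketch and not part of the original answer: it assumes dplyr 1.1.0 or later, where join_by() and its between() helper are available, and it rebuilds the question's sample data so that it runs on its own.

```r
library(dplyr)

# Sample data from the question, rebuilt with tibble()
key <- tibble(
  exhibit.name  = c("lions", "otters", "penguins"),
  exhibit.start = as.Date(c("2016-04-01", "2016-05-01", "2016-06-01")),
  exhibit.end   = as.Date(c("2016-04-30", "2016-05-31", "2016-06-30"))
)
data <- tibble(
  exhibit.name = c("lions", "lions", "otters", "otters", "penguins", "penguins"),
  exhibit.date = as.Date(c("2016-04-15", "2016-12-15", "2016-05-15",
                           "2016-02-15", "2016-06-15", "2016-10-15"))
)

# Keep rows of 'data' whose name matches a row of 'key' and whose date lies
# in [exhibit.start, exhibit.end]; between() is inclusive on both ends by default
valid.exhibits <- data %>%
  inner_join(key, by = join_by(exhibit.name,
                               between(exhibit.date, exhibit.start, exhibit.end))) %>%
  select(exhibit.name, exhibit.date)

print(valid.exhibits)
```

Because the range condition is part of the join itself, no separate filter step is needed, and rows of 'data' with no matching exhibit window are dropped by the inner join.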
We can also use a non-equi (conditional) join, which is available in the development version of data.table, i.e. v1.9.7+:
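library(data.table)
setDT(key)
# update join: flag the rows of 'data' whose date lies strictly inside the exhibit window
setDT(data)[key, on = .(exhibit.name, exhibit.date > exhibit.start,
                        exhibit.date < exhibit.end), new := 1]
# drop the unflagged rows (new is NA), then remove the helper column
na.omit(data)[, new := NULL][]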
# exhibit.name exhibit.date
#1: lions 2016-04-15
#2: otters 2016-05-15
#3: penguins 2016-06-15
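For completeness, the same join-then-filter idea can be sketched in base R with no extra packages. This is an illustrative variant rather than something from the original answer; it uses plain data.frame objects built from the question's sample data and treats the start and end dates as inclusive, which is an assumption about what "fall within" means.

```r
# Sample data from the question as plain data frames
key <- data.frame(
  exhibit.name  = c("lions", "otters", "penguins"),
  exhibit.start = as.Date(c("2016-04-01", "2016-05-01", "2016-06-01")),
  exhibit.end   = as.Date(c("2016-04-30", "2016-05-31", "2016-06-30"))
)
data <- data.frame(
  exhibit.name = c("lions", "lions", "otters", "otters", "penguins", "penguins"),
  exhibit.date = as.Date(c("2016-04-15", "2016-12-15", "2016-05-15",
                           "2016-02-15", "2016-06-15", "2016-10-15"))
)

# Attach each exhibit's window to every visit with the same name
merged <- merge(data, key, by = "exhibit.name")

# Keep the visits whose date lies inside the window (inclusive bounds assumed)
inside <- merged$exhibit.date >= merged$exhibit.start &
  merged$exhibit.date <= merged$exhibit.end
valid.exhibits <- merged[inside, c("exhibit.name", "exhibit.date")]
rownames(valid.exhibits) <- NULL  # renumber rows after subsetting

print(valid.exhibits)
```

Note that merge() sorts its result by the join column by default, so the row order may differ from the order of 'data' on other inputs.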
Upvotes: 4