Reputation: 97
I have these datasets:
set.seed(1)
df1<- data.frame(
user = as.factor(rep(c("mike","john","david", "gabriel"), each =4)),
trx_date = sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"), 16)
)
df2<- data.frame(
user = as.factor(c("mike","john","david")),
filter_date = as.Date(c("1999-07-29", "1999-03-08", "1999-10-24"))
)
How do I filter df1 to keep only the rows whose trx_date falls after the filter_date in df2 for the same user?
Upvotes: 0
Views: 26
Reputation: 887118
In base R, we can use merge with subset:
subset(merge(df1, df2, by = 'user'), trx_date > filter_date)
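To make the behaviour easier to check by hand, here is a minimal sketch of the same merge and subset approach on a small made-up dataset. The rows below are hypothetical and are not the asker's sampled data. It also shows that merge defaults to an inner join, so a user missing from df2 ("gabriel" here) drops out of the result.

```r
# Hypothetical toy data (not the asker's sampled data), small enough to verify by eye
df1 <- data.frame(
  user = c("mike", "mike", "john", "john", "gabriel"),
  trx_date = as.Date(c("1999-06-01", "1999-09-15", "1999-03-08",
                       "1999-04-20", "1999-05-05"))
)
df2 <- data.frame(
  user = c("mike", "john"),
  filter_date = as.Date(c("1999-07-29", "1999-03-08"))
)

# merge() performs an inner join by default, so "gabriel" (absent from df2) is dropped
merged <- merge(df1, df2, by = "user")

# Keep only transactions strictly after each user's filter_date;
# john's 1999-03-08 row equals his filter_date, so it is excluded
result <- subset(merged, trx_date > filter_date)
print(result)
```

If the unmatched users should be kept instead, merge also accepts all.x = TRUE, which turns the join into a left join and leaves filter_date as NA for them.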
Upvotes: 0
Reputation: 5766
Using the dplyr package, you could do:
library(dplyr)
full_join(df1, df2, by=c('user')) %>%
group_by(user) %>%
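filter(trx_date > filter_date)
But what do you want to do with "gabriel"? That user does not exist in df2, so how should those rows be filtered? With the above solution they are lost, because filter_date is NA for them and filter drops rows where the condition evaluates to NA. If you want to keep them, replace the filter call with filter(trx_date > filter_date | is.na(filter_date)). (Note the single |, the vectorized OR applied row by row, as opposed to ||, which is meant for single logical values.)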
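As a small illustration of why the extra is.na() term rescues the unmatched rows, here is a base R sketch on hand-written vectors. The values are hypothetical and stand in for the columns produced by the join.

```r
# Hypothetical values standing in for the joined columns;
# the third row mimics a user with no entry in df2 (filter_date is NA)
trx_date    <- as.Date(c("1999-08-01", "1999-02-01", "1999-05-05"))
filter_date <- as.Date(c("1999-07-29", "1999-07-29", NA))

# Comparing against NA yields NA, which filter() and subset() treat as "drop"
after_only <- trx_date > filter_date

# The element-wise | turns that NA into TRUE wherever filter_date is missing
keep <- trx_date > filter_date | is.na(filter_date)

print(after_only)
print(keep)
```

The first vector has NA in the third position, while the second has TRUE there, which is why the row survives the filter.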
Upvotes: 1
Reputation: 388982
You can join the two data frames and then filter:
library(dplyr)
df1 %>%
inner_join(df2, by = 'user') %>%
filter(trx_date > filter_date)
Upvotes: 1