Reputation: 1601
I have a data frame with two groups, the years they were eligible for a service, and a flag indicating participation (each group will participate in one year and one year only, but not all groups participate).
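I would like to filter the data frame as follows:
- for groups that participated, keep only the rows up to and including the participation year
- for groups that never participated, keep all rows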
I know that I can do this piecemeal by splitting the data frame into groups that participated vs. those that didn't and then binding them back together, but I'm wondering whether there is any way to do this in a single pipe?
library(dplyr)

dat <- data.frame(grp = c("a", "a", "a", "b", "b"),
                  yr  = c(2017, 2018, 2019, 2017, 2018),
                  par = c(0, 1, 0, 0, 0))

## this works because group 'a' participated
dat %>%
  group_by(grp) %>%
  filter(grp == "a") %>%
  filter(yr <= yr[par == 1])

## here I am trying to use conditional filtering but failing
dat %>%
  group_by(grp) %>%
  filter(yr <= case_when(max(par) == 1 ~ yr[par == 1],
                         TRUE ~ max(yr)))
Upvotes: 1
Views: 131
Reputation: 21908
I think this may help you but I would've liked to test it on a larger data set:
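library(dplyr)

# note: this relies on rows being ordered by yr within each group
dat %>%
  group_by(grp) %>%
  # cumx is 0 before the participation year, 1 in it, and > 1 afterwards
  mutate(cumx = cumsum(cummax(par))) %>%
  filter(cumx %in% c(0,1) | all(cumx == 0)) %>%
  select(-cumx)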
# A tibble: 4 x 3
# Groups:   grp [2]
  grp      yr   par
  <chr> <dbl> <dbl>
1 a      2017     0
2 a      2018     1
3 b      2017     0
4 b      2018     0
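If the row order within a group can't be guaranteed, here is a sketch of an alternative that compares each year against a per-group cutoff instead of using cumulative sums. As far as I can tell, the `case_when()` attempt fails because `case_when()` evaluates every right-hand side, and `yr[par == 1]` is empty for a group that never participated. Passing `Inf` as an extra argument to `min()` gives such groups a cutoff of `Inf`, so all of their rows are kept. The name `res` is just for illustration, and this assumes at most one participation year per group, as described in the question.

```r
library(dplyr)

dat <- data.frame(grp = c("a", "a", "a", "b", "b"),
                  yr  = c(2017, 2018, 2019, 2017, 2018),
                  par = c(0, 1, 0, 0, 0))

res <- dat %>%
  group_by(grp) %>%
  # cutoff = participation year, or Inf if the group never participated;
  # the extra Inf also keeps min() from warning on an empty vector
  filter(yr <= min(yr[par == 1], Inf)) %>%
  ungroup()

res
```

On the example data this should keep 2017 and 2018 for group 'a' and both rows for group 'b', the same rows as the output above.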
Upvotes: 1