rnorouzian

Reputation: 7517

Conditional subsetting from dataframe with multiple conditions

In my data.frame below, I want to subset the rows for which group == 0 AND all three years (2017, 2018, and 2019) are present.

Desired output in the example below is the information on rows 4, 5, and 6.

I have tried the following solution with no success. Is there a quick fix in BASE R?

dat <- data.frame(group = c(0,0,1, 0,0,0, 1, 1, 1), 
       year = rep(2017:2019, 3))

subset(dat, group == 0 & year == 2017 & year == 2018 & year == 2019)

Upvotes: 3

Answers (1)

akrun

Reputation: 886938

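If the OP wants to treat each run of adjacent identical 'group' values as its own block, we can create a run-length id with rleid, then keep the blocks where 'group' is 0 and all of 2017:2019 are present in 'year':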

library(dplyr)
library(data.table)
dat %>%
   group_by(grp = rleid(group)) %>%
   filter(all(2017:2019 %in% year), group == 0) %>%
   ungroup %>%
   select(-grp)
# A tibble: 3 x 2
#  group  year
#  <dbl> <int>
#1     0  2017
#2     0  2018
#3     0  2019

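Or in base R with rle:

# run id: increments whenever 'group' changes between adjacent rows
grp <- with(rle(dat$group), rep(seq_along(values), lengths))
# 1 for every row of a run that contains all three years, else 0
has_all <- ave(dat$year, grp, FUN = function(x) all(2017:2019 %in% x))
subset(dat, as.logical(has_all) & group == 0)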
#  group year
#4     0 2017
#5     0 2018
#6     0 2019
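
For reference, here is a minimal base R sketch of why the attempt in the question returns nothing and how the run ids look. The helper name `keep_complete_runs` and its arguments `target` and `years` are made up for illustration and are not part of any package; it also assumes the rows are already ordered so that each block of interest is contiguous.

```r
dat <- data.frame(group = c(0, 0, 1, 0, 0, 0, 1, 1, 1),
                  year  = rep(2017:2019, 3))

# The condition is evaluated row by row, and a single 'year' value can never
# equal 2017, 2018 and 2019 at the same time, so no row passes.
n_original <- nrow(subset(dat, group == 0 & year == 2017 &
                                 year == 2018 & year == 2019))
n_original
# [1] 0

# Run ids: a new id starts whenever 'group' changes between adjacent rows.
runs <- rle(dat$group)
grp  <- rep(seq_along(runs$values), runs$lengths)
grp
# [1] 1 1 2 3 3 3 4 4 4

# Hypothetical helper (illustration only): keep the rows of each run where
# 'group' equals 'target' and every value in 'years' occurs in 'year'.
keep_complete_runs <- function(d, target = 0, years = 2017:2019) {
  r  <- rle(d$group)
  id <- rep(seq_along(r$values), r$lengths)
  has_all <- ave(d$year, id, FUN = function(x) all(years %in% x))
  d[as.logical(has_all) & d$group == target, , drop = FALSE]
}

res <- keep_complete_runs(dat)
res
#   group year
# 4     0 2017
# 5     0 2018
# 6     0 2019
```

The first run of zeros (rows 1 and 2) is dropped because it only covers 2017 and 2018, which is the behaviour the expected output in the question asks for.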

Upvotes: 2
