Chris Ruehlemann

Reputation: 21400

Filter by group and conditions

I have this type of data, where Sequ is a grouping variable:

df <- data.frame(
  Sequ = c(1,1,1,
           2,2,2,
           3,3,
           4,4),
  Answerer = c("A", NA, NA, "A", NA, NA, "B", NA, "C", NA),
  PP_by = c(rep("A",5), rep("B",5)),
  pp = c(0.1,0.2,0.3, 1, NA, NA, NA, NA, NA, NA)
)

I need to remove any Sequ where both (i) pp contains at least one NA and (ii) the Answerer is the same as PP_by.

I've tried this, but it obviously implements just the first condition (i):

library(dplyr)
df %>%
  group_by(Sequ) %>%
  filter(all(!is.na(pp)))

The expected result is:

   Sequ Answerer PP_by  pp
1     1        A     A 0.1
2     1     <NA>     A 0.2
3     1     <NA>     A 0.3
9     4        C     B  NA
10    4     <NA>     B  NA

EDIT:

I've come up with this solution:

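df %>%
  group_by(Sequ) %>%
  filter(
    # keep the group if the first Answerer differs from the first PP_by ...
    first(Answerer) != first(PP_by) |
      # ... or if pp has no missing values
      all(!is.na(pp))
  )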
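
One caveat, assuming a group's first Answerer could itself be NA (this does not happen in the sample data): the comparison then returns NA rather than TRUE or FALSE, and filter() drops rows whose condition evaluates to NA. A minimal base R sketch of the logic involved, with illustrative variable names that are not part of the original code:

```r
# Comparing a missing Answerer with a PP_by value gives NA, not FALSE
cmp <- NA_character_ != "A"

# NA | TRUE is TRUE: a group with complete pp would still be kept
keep_when_pp_complete <- cmp | TRUE

# NA | FALSE is NA: a group with missing pp would be dropped by filter()
keep_when_pp_has_na <- cmp | FALSE

print(c(cmp, keep_when_pp_complete, keep_when_pp_has_na))
```

So a group whose first Answerer is NA and whose pp has missing values would be removed, even though no Answerer was actually matched against PP_by.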

Upvotes: 1

Views: 34

Answers (1)

Gregor Thomas

Reputation: 145755

Here's another way:

df %>%
  group_by(Sequ) %>%
  # drop a group only when both conditions hold
  filter(!(
    # na.rm = TRUE ignores rows where Answerer is NA
    any(Answerer == PP_by, na.rm = TRUE) &
      any(is.na(pp))
  ))
# # A tibble: 5 × 4
# # Groups:   Sequ [2]
#    Sequ Answerer PP_by    pp
#   <dbl> <chr>    <chr> <dbl>
# 1     1 A        A       0.1
# 2     1 NA       A       0.2
# 3     1 NA       A       0.3
# 4     4 C        B      NA  
# 5     4 NA       B      NA  
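
For comparison, the same group-level logic can be sketched in base R with tapply() instead of dplyr. This is only an illustration of the idea, not a replacement for the pipeline above; the names has_match, has_na_pp, drop_ids and result are made up for the example.

```r
df <- data.frame(
  Sequ = c(1,1,1, 2,2,2, 3,3, 4,4),
  Answerer = c("A", NA, NA, "A", NA, NA, "B", NA, "C", NA),
  PP_by = c(rep("A",5), rep("B",5)),
  pp = c(0.1,0.2,0.3, 1, NA, NA, NA, NA, NA, NA)
)

# Per Sequ: does any non-missing Answerer equal PP_by?
has_match <- tapply(df$Answerer == df$PP_by, df$Sequ, any, na.rm = TRUE)

# Per Sequ: does pp contain at least one NA?
has_na_pp <- tapply(is.na(df$pp), df$Sequ, any)

# Groups to remove are those where both flags are TRUE
drop_ids <- names(has_match)[has_match & has_na_pp]

# Keep every row whose Sequ is not in the removal set
result <- df[!as.character(df$Sequ) %in% drop_ids, ]
print(result)
```

Because base R subsetting keeps the original row names, the printed result shows rows 1, 2, 3, 9 and 10, as in the expected output in the question.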

Upvotes: 3
