Filter groups within a dataframe where all elements of a list are present in R

Question

I have a list such as the_list <- c("SP1","SP2")

And I have a dataframe such as:

Groups Names 
G1     SP1
G1     SP2
G1     SP3
G1     SP4
G1     SP5
G2     SP1
G2     SP4
G2     SP5
G3     SP6
G3     SP7
G3     SP8
G4     SP1
G4     SP2
G4     SP7

And I would like to only keep Groups where ALL elements in the_list are present within the Names column.

I should then get:

   Groups Names 
    G1     SP1
    G1     SP2
    G1     SP3
    G1     SP4
    G1     SP5
    G4     SP1
    G4     SP2
    G4     SP7

So far I tried:

df <- df %>%
  group_by(Groups) %>%
  filter(all(Names %in% c('SP1','SP2')))

Salix · Accepted Answer

You almost have it. The problem is that the current syntax is asking "are all the values in the column 'Names' in c('SP1','SP2')?" instead of "are all the values in c('SP1','SP2') in the column 'Names'?".
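To see the difference concretely, here is a small base R sketch using the Names values of group G1 from the question. The variable names names_g1 and wanted are made up for illustration.

```r
# Names in group G1, copied from the question's data
names_g1 <- c("SP1", "SP2", "SP3", "SP4", "SP5")
wanted   <- c("SP1", "SP2")

# For each name in the group: is it one of the wanted values?
names_g1 %in% wanted        # TRUE TRUE FALSE FALSE FALSE
all(names_g1 %in% wanted)   # FALSE, because SP3, SP4 and SP5 are not wanted values

# For each wanted value: does it appear in the group?
wanted %in% names_g1        # TRUE TRUE
all(wanted %in% names_g1)   # TRUE, so G1 should be kept
```

The first form only keeps a group whose names are all drawn from the wanted values, which is why G1 was dropped by the original attempt.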

So you just need to swap the left- and right-hand sides of the %in%, like this:

df %>%
  group_by(Groups) %>%
  filter(all(c('SP1','SP2') %in% Names))

And that will give you :

# # A tibble: 8 x 2
# # Groups:   Groups [2]
# Groups Names
#   
# 1 G1     SP1  
# 2 G1     SP2  
# 3 G1     SP3  
# 4 G1     SP4  
# 5 G1     SP5  
# 6 G4     SP1  
# 7 G4     SP2  
# 8 G4     SP7  
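
For anyone who wants to reproduce this, here is a self-contained sketch. It assumes dplyr is installed and that the data frame is built with data.frame() as shown, since the question does not say how df was created. The trailing ungroup() and the result variable are additions for convenience, not part of the original answer.

```r
library(dplyr)

# Rebuild the example data from the question (construction is an assumption)
df <- data.frame(
  Groups = c(rep("G1", 5), rep("G2", 3), rep("G3", 3), rep("G4", 3)),
  Names  = c("SP1", "SP2", "SP3", "SP4", "SP5",
             "SP1", "SP4", "SP5",
             "SP6", "SP7", "SP8",
             "SP1", "SP2", "SP7")
)

the_list <- c("SP1", "SP2")

# Keep only the groups whose Names contain every element of the_list
result <- df %>%
  group_by(Groups) %>%
  filter(all(the_list %in% Names)) %>%
  ungroup()

unique(result$Groups)   # "G1" "G4"
nrow(result)            # 8
```

Using the_list inside filter() rather than a hard-coded vector means the same code works if the set of required names changes.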
