Phil.He
Phil.He

Reputation: 154

Filtering in a group over multiple rows in R

I have a data frame looking like this with ID and two conditions:

df <- data.frame(ID=c("A", "A", "A", "B", "C"), cond1=c("yes", "yes", "no", "no", "yes"), cond2=c("no", "no", "yes", "yes", "yes"))
df
   ID cond1 cond2
1  A   yes    no
2  A   yes    no
3  A    no   yes
4  B    no   yes
5  C   yes   yes

I want to filter IDs for which cond1 and cond2 applies (=yes), but it does not have to be true in the same row. This means that I would like to filter out ID A and C, but not B.

I have tried to group_by() and filter() using dplyr:

df %>%
group_by(PID)%>%
filter(cond1==yes&cond2==yes)

But this only filters out a PID/row in which cond 1 and cond 2 both apply (so only ID C/row 5).

What can I do to filter PIDs/rows by group with two conditions that both have t0 be true within one group, but not within one row?

Thank you!

Upvotes: 2

Views: 937

Answers (3)

akrun
akrun

Reputation: 887901

Using base R

subset(df, ave(cond1 == 'yes', ID, FUN = any) &
            ave(cond2 == 'yes', ID, FUN = any))

-output

 ID cond1 cond2
1  A   yes    no
2  A   yes    no
3  A    no   yes
5  C   yes   yes

Upvotes: 1

ThomasIsCoding
ThomasIsCoding

Reputation: 102710

A data.table option

> setDT(df)[, .SD[all(colSums(.SD == "yes") > 0)], ID]
   ID cond1 cond2
1:  A   yes    no
2:  A   yes    no
3:  A    no   yes
4:  C   yes   yes

Upvotes: 1

Ronak Shah
Ronak Shah

Reputation: 389275

You can use any -

library(dplyr)

df %>%
  group_by(ID) %>%
  filter(any(cond1 == 'yes') && any(cond2 == 'yes')) %>%
  ungroup

#  ID    cond1 cond2
#  <chr> <chr> <chr>
#1 A     yes   no   
#2 A     yes   no   
#3 A     no    yes  
#4 C     yes   yes  

Upvotes: 6

Related Questions