Extract the row with the earliest date if it meets multiple conditions in R

Question

I would like to filter my dataset based on the following conditions:

Only for rows where Disease = ABC and Class = Colonized, and where First_Name and Last_Name match, keep just the row with the earliest Spec_Date.
Keep all rows for every other disease regardless of Class, and all rows with Disease = ABC and Class = Clinical.

Disease<-c("ABC","ABC","CRE","MCA","ABC","ABC","CRE","MCA")
Class<-c("Colonized","Colonized","Clinical","Clinical","Colonized","Clinical","Clinical","Clinical")
First_Name<-c("Roger","Roger","John","John","Roger","Mary","James","Lee")
Last_Name<-c("Smith","Smith","Doe","Doe","Smith","Poppins","Bond","Majors")
Spec_Date<-as.Date(c("2001-01-01","2002-01-01","2003-01-01","2003-01-01","2004-01-01","2001-01-01","2003-01-01","2004-01-01"))
df<-data.frame(Disease,Class,First_Name,Last_Name,Spec_Date)

The resulting dataset should be as below:

Disease<-c("ABC","CRE","MCA","ABC","CRE","MCA")
Class<-c("Colonized","Clinical","Clinical","Clinical","Clinical","Clinical")
First_Name<-c("Roger","John","John","Mary","James","Lee")
Last_Name<-c("Smith","Doe","Doe","Poppins","Bond","Majors")
Spec_Date<-as.Date(c("2001-01-01","2003-01-01","2003-01-01","2001-01-01","2003-01-01","2004-01-01"))
df2<-data.frame(Disease,Class,First_Name,Last_Name,Spec_Date)

Any help is really appreciated.

asd-tm · Accepted Answer

Do you mean this?

EDIT

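library(dplyr)

union_all(
  # Rows that are NOT (ABC and Colonized) are kept unchanged
  df %>% 
    filter(Disease != "ABC" | Class != "Colonized"),
  # ABC/Colonized rows are collapsed to one row per person with the earliest date
  df %>% 
    filter(Disease == "ABC" & Class == "Colonized") %>% 
    group_by(First_Name, Last_Name) %>% 
    summarise(Disease = "ABC",
              Class = "Colonized",
              Spec_Date = min(Spec_Date),  # min() of a Date is already a Date
              .groups = "drop")
)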

    
  Disease     Class First_Name Last_Name  Spec_Date
1     CRE  Clinical       John       Doe 2003-01-01
2     MCA  Clinical       John       Doe 2003-01-01
3     ABC  Clinical       Mary   Poppins 2001-01-01
4     CRE  Clinical      James      Bond 2003-01-01
5     MCA  Clinical        Lee    Majors 2004-01-01
6     ABC Colonized      Roger     Smith 2001-01-01
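
As a possible alternative, here is a minimal sketch that does the same thing in a single grouped filter() instead of splitting the data and recombining it. It is not part of the original answer; the name result is only illustrative. It assumes dplyr is installed and rebuilds the question's sample data so it can run on its own. Rows that are not ABC/Colonized always pass the filter; ABC/Colonized rows pass only if their date is the earliest in their group.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  Disease    = c("ABC", "ABC", "CRE", "MCA", "ABC", "ABC", "CRE", "MCA"),
  Class      = c("Colonized", "Colonized", "Clinical", "Clinical",
                 "Colonized", "Clinical", "Clinical", "Clinical"),
  First_Name = c("Roger", "Roger", "John", "John", "Roger", "Mary", "James", "Lee"),
  Last_Name  = c("Smith", "Smith", "Doe", "Doe", "Smith", "Poppins", "Bond", "Majors"),
  Spec_Date  = as.Date(c("2001-01-01", "2002-01-01", "2003-01-01", "2003-01-01",
                         "2004-01-01", "2001-01-01", "2003-01-01", "2004-01-01"))
)

# 'result' is a hypothetical name used for illustration
result <- df %>%
  # One group per person within each Disease/Class combination
  group_by(Disease, Class, First_Name, Last_Name) %>%
  # Keep every row that is not ABC/Colonized; for ABC/Colonized rows,
  # keep only those whose date equals the group's earliest date
  filter(!(Disease == "ABC" & Class == "Colonized") |
           Spec_Date == min(Spec_Date)) %>%
  ungroup() %>%
  as.data.frame()

result
```

Unlike the union_all() version, this keeps the rows in their original order and carries along any additional columns the real dataset may have. One caveat: if a person has two ABC/Colonized rows sharing the same earliest date, both are kept, so a tie-break would have to be added if exactly one row is required.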
