R - Exclude rows with events by date

Question

I have a dataframe with ID's, entry dates, events (X1.0 and X1.1) and the dates corresponding to event times (X2.0 and X2.1). I need to find ID's of people who had a specific event before the entry date.

library(dplyr)

df <- data.frame(ID = as.integer(c(10001, 10002, 10004, 10005, 10006)),
                 entry_date = as.Date(c("2013-12-22", "2013-02-13", "2013-03-13", "2013-01-01", "2013-02-02")), 
                 X1.0 = as.integer(c(150, NA, 120, NA, 150)),
                 X1.1 = as.integer(c(120, 180, NA, 150, 120)),
                 X2.0 = as.Date(c("2017-02-13", NA, "2014-06-04", NA, "1990-03-08")), 
                 X2.1 = as.Date(c("2003-12-22", "2011-02-13", NA, "1998-01-01", "2015-02-01")))

#filtering columns that begin with X1 by event 150

> dplyr::filter_at(df, vars(starts_with("X1")), any_vars(. == 150)) 

     ID entry_date X1.0 X1.1       X2.0       X2.1
1 10001 2013-12-22  150  120 2017-02-13 2003-12-22
2 10005 2013-01-01   NA  150        1998-01-01
3 10006 2013-02-02  150  120 1990-03-08 2015-02-01

My problem is filtering the dates conditionally because I'm only interested in filtering the "150" events that have occurred before the entry date, and there are other events present. End result should look like this:

     ID entry_date X1.0 X1.1       X2.0       X2.1
1 10005 2013-01-01   NA  150        1998-01-01
2 10006 2013-02-02  150  120 1990-03-08 2015-02-01

Ronak Shah · Accepted Answer

We can get the data in long format, getting event value and event date in separate columns. We can then select only those rows which have event value as 150 and entry_date greater than event date. We join this data back to df to get all the corresponding columns.

library(dplyr)

df %>%
  tidyr::pivot_longer(cols = -c(ID, entry_date), 
                      names_to = c('.value', 'col'), 
                      names_sep = "\.") %>%
   filter(X1 == 150 & entry_date > X2) %>%
   select(ID) %>%
   left_join(df, by = 'ID')


#    ID entry_date  X1.0  X1.1 X2.0       X2.1      
#                   
#1 10005 2013-01-01    NA   150 NA         1998-01-01
#2 10006 2013-02-02   150   120 1990-03-08 2015-02-01

R - Exclude rows with events by date

Answers (2)

Related Questions