jpsmith
jpsmith

Reputation: 17175

Determine overlapping date intervals conditional on date type in R

I have a simple dataset containing the "date in" and "date out" of a healthcare facility, and date type (inpatient, outpatient, and infectious period) for each patient. I need to determine if a patient overlapped with an infectious period of another patient. I can typically do this using the lubridate package's interval and int_overlaps functions. My specific issue is when there are multiple infectious periods that do not overlap.

I am using R. Code to reproduce sample data and the figure is below.

I want to flag each visit with a logical T/F if it falls within the interval of an infectious period. The below figure may help visualize these data. Red rectangles are inpatient stays, and red circles are outpatient visits. The purple is the infectious period during a patient's inpatient stay. Only inpatient/outpatient visit that overlap with a purple interval should be flagged (i.e., with a logical TRUE or FALSE). Ideally the patient that is causing the infectious period would not be flagged (i.e. the long inpatient stay for K00005 would return F, but I can work around that if that causes complications.

Figure for SO

I have tried:

library(tidyverse); library(lubridate);

test <- have %>% mutate(Int=interval(datein, dateout),
                        overlaps=map(seq_along(Int), function(x){
                                      y=setdiff(seq_along(Int),x)
                                      return(any(int_overlaps(Int[x],Int[y])))
                                      }))

I feel like I'm close, but this seems to check each interval against all the intervals, not just the infectious periods, so they all come back TRUE. Is there a way to have it only check the intervals from the infectious period (either through modifying the above or with fresh code)?

I have search SO and have read several questions/responses that deal with similar issues, but none of them are solving this specific issue. Any help would be much appreciated!

library(tidyverse); library(lubridate);

have <- structure(list(id = c("K00005", "K52253", "K32022", "K20113", 
                              "K52253", "K00164", "K00164", "K10003", "K00347", "K00046", "K52253", 
                              "K00198", "K32022", "K00198", "K00685", "K00685", "K18122", "K00198", 
                              "K00347", "K00198", "K00198", "K32022", "K52135", "K34060", "K00164", 
                              "K04048", "K00135", "K32022", "K00685", "K00198", "K52253", "K30008", 
                              "K32022", "K32022", "K00347", "K00164", "K00135", "K00137", "K32022", 
                              "K32022", "K52253", "K00005", "K00046", "K00137"), 
                       datetype = c("Inpatient", "Outpatient", "Inpatient", "Outpatient", "Outpatient", "Outpatient", 
                                     "Outpatient", "Outpatient", "Outpatient", "Inpatient", "Outpatient", 
                                     "Inpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient", 
                                     "Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient", 
                                     "Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient", 
                                     "Outpatient", "Outpatient", "Outpatient", "Inpatient", "Outpatient", 
                                     "Outpatient", "Outpatient", "Outpatient", "Outpatient", "Inpatient", 
                                     "Inpatient", "Inpatient", "Outpatient", "Outpatient", "Outpatient", 
                                     "Infectious Period", "Infectious Period", "Infectious Period"), 
                       datein = structure(c(17542, 17544, 17556, 17559, 17586, 17602, 
                                             17608, 17623, 17626, 17626, 17641, 17642, 17651, 17657, 17659, 
                                             17661, 17664, 17668, 17668, 17675, 17675, 17676, 17681, 17685, 
                                             17699, 17703, 17712, 17712, 17713, 17719, 17721, 17739, 17739, 
                                             17742, 17745, 17746, 17755, 17760, 17768, 17768, 17788, 17542, 
                                             17626, 17760), class = "Date"), 
                       dateout = structure(c(17733, 17544, 17560, 17559, 17586, 17602, 17608, 17623, 17626, 17638, 
                                             17641, 17655, 17651, 17657, 17659, 17661, 17664, 17668, 17668, 
                                             17675, 17675, 17676, 17681, 17685, 17699, 17703, 17712, 17712, 
                                             17713, 17795, 17721, 17739, 17739, 17742, 17745, 17753, 17762, 
                                             17794, 17768, 17768, 17788, 17564, 17638, 17777), class = "Date"), 
                       color = c("#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", 
                                 "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", 
                                 "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", 
                                 "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", 
                                 "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", 
                                 "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", 
                                 "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", 
                                 "purple", "purple", "purple")), 
                  row.names = c(NA, -44L), class = c("tbl_df", "tbl", "data.frame"))

                      

require(vistime)
gg_vistime(have, 
           col.group="id", 
           col.event="datetype", 
           col.start="datein", 
           col.end="dateout", 
           col.color = "color", 
           show_labels = FALSE, 
           optimize_y = TRUE,
           #           linewidth = 5,
           title = "Figure for Stack Overflow Question")

Upvotes: 2

Views: 220

Answers (1)

Evan Friedland
Evan Friedland

Reputation: 3194

Here is a freshcode Base R approach (no libraries) using a very basic for loop. If the patient checked in while there was an infection (started_during), or left while there was an infection (ended_during), or was inpatient while a infectious period began and ended (in_during) it should flag the overlap as TRUE.

infectious_periods <- have[which(have$datetype=="Infectious Period"),]
have$overlap <- FALSE # initializes a new column

for(i in 1:nrow(have)){
  if(have$datetype[i] != "Infectious Period"){
    started_during <- any(have$datein[i] >= infectious_periods$datein & 
                            have$datein[i] <= infectious_periods$dateout)
    ended_during <- any(have$dateout[i] >= infectious_periods$datein & 
                          have$dateout[i] <= infectious_periods$dateout)
    in_during <- any(have$datein[i] >= infectious_periods$datein & 
                       have$dateout[i] <= infectious_periods$dateout)
    if(started_during | ended_during | in_during){
        have$overlap[i] <- TRUE
      }
  }
}
have$overlap
# A tibble: 44 x 6
#   id     datetype   datein     dateout    color   overlap
#   <chr>  <chr>      <date>     <date>     <chr>   <lgl>  
# 1 K00005 Inpatient  2018-01-11 2018-07-21 #DD4B39 TRUE   
# 2 K52253 Outpatient 2018-01-13 2018-01-13 #DD4B39 TRUE   
# 3 K32022 Inpatient  2018-01-25 2018-01-29 #DD4B39 TRUE   
# 4 K20113 Outpatient 2018-01-28 2018-01-28 #DD4B39 TRUE   
# 5 K52253 Outpatient 2018-02-24 2018-02-24 #DD4B39 FALSE  
# 6 K00164 Outpatient 2018-03-12 2018-03-12 #DD4B39 FALSE  
# 7 K00164 Outpatient 2018-03-18 2018-03-18 #DD4B39 FALSE  
# 8 K10003 Outpatient 2018-04-02 2018-04-02 #DD4B39 FALSE  
# 9 K00347 Outpatient 2018-04-05 2018-04-05 #DD4B39 TRUE   
#10 K00046 Inpatient  2018-04-05 2018-04-17 #DD4B39 TRUE  
# ... with 34 more rows

If this doesn't solve your needs let me know. There is definitely more that could be done like a count of the other infectious periods the patient overlaps with, but this should get you started.

pic

Code for pic

library(ggplot2)
have$size <- ifelse(have$overlap,2,1)
ggplot(have, aes(datein,datetype,col=datetype,shape=datetype,cex = size)) + geom_point() + 
  facet_grid(rows = vars(id),switch = "y") + 
  geom_vline(xintercept=infectious_periods$datein) + 
  geom_vline(xintercept=infectious_periods$dateout) +  
  theme(strip.text.y.left = element_text(angle = 0)) +
  geom_linerange(aes(xmin = datein, xmax = dateout), color = have$color,size = 2) 

Upvotes: 2

Related Questions