Reputation: 17175
I have a simple dataset containing the "date in" and "date out" of a healthcare facility, and date type (inpatient, outpatient, and infectious period) for each patient. I need to determine if a patient overlapped with an infectious period of another patient. I can typically do this using the lubridate
package's interval
and int_overlaps
functions. My specific issue is when there are multiple infectious periods that do not overlap.
I am using R. Code to reproduce sample data and the figure is below.
I want to flag each visit with a logical T/F if it falls within the interval of an infectious period. The below figure may help visualize these data. Red rectangles are inpatient stays, and red circles are outpatient visits. The purple is the infectious period during a patient's inpatient stay. Only inpatient/outpatient visit that overlap with a purple interval should be flagged (i.e., with a logical TRUE
or FALSE
). Ideally the patient that is causing the infectious period would not be flagged (i.e. the long inpatient stay for K00005
would return F
, but I can work around that if that causes complications.
I have tried:
library(tidyverse); library(lubridate);
test <- have %>% mutate(Int=interval(datein, dateout),
overlaps=map(seq_along(Int), function(x){
y=setdiff(seq_along(Int),x)
return(any(int_overlaps(Int[x],Int[y])))
}))
I feel like I'm close, but this seems to check each interval against all the intervals, not just the infectious periods, so they all come back TRUE
. Is there a way to have it only check the intervals from the infectious period (either through modifying the above or with fresh code)?
I have search SO and have read several questions/responses that deal with similar issues, but none of them are solving this specific issue. Any help would be much appreciated!
library(tidyverse); library(lubridate);
have <- structure(list(id = c("K00005", "K52253", "K32022", "K20113",
"K52253", "K00164", "K00164", "K10003", "K00347", "K00046", "K52253",
"K00198", "K32022", "K00198", "K00685", "K00685", "K18122", "K00198",
"K00347", "K00198", "K00198", "K32022", "K52135", "K34060", "K00164",
"K04048", "K00135", "K32022", "K00685", "K00198", "K52253", "K30008",
"K32022", "K32022", "K00347", "K00164", "K00135", "K00137", "K32022",
"K32022", "K52253", "K00005", "K00046", "K00137"),
datetype = c("Inpatient", "Outpatient", "Inpatient", "Outpatient", "Outpatient", "Outpatient",
"Outpatient", "Outpatient", "Outpatient", "Inpatient", "Outpatient",
"Inpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient",
"Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient",
"Outpatient", "Outpatient", "Outpatient", "Outpatient", "Outpatient",
"Outpatient", "Outpatient", "Outpatient", "Inpatient", "Outpatient",
"Outpatient", "Outpatient", "Outpatient", "Outpatient", "Inpatient",
"Inpatient", "Inpatient", "Outpatient", "Outpatient", "Outpatient",
"Infectious Period", "Infectious Period", "Infectious Period"),
datein = structure(c(17542, 17544, 17556, 17559, 17586, 17602,
17608, 17623, 17626, 17626, 17641, 17642, 17651, 17657, 17659,
17661, 17664, 17668, 17668, 17675, 17675, 17676, 17681, 17685,
17699, 17703, 17712, 17712, 17713, 17719, 17721, 17739, 17739,
17742, 17745, 17746, 17755, 17760, 17768, 17768, 17788, 17542,
17626, 17760), class = "Date"),
dateout = structure(c(17733, 17544, 17560, 17559, 17586, 17602, 17608, 17623, 17626, 17638,
17641, 17655, 17651, 17657, 17659, 17661, 17664, 17668, 17668,
17675, 17675, 17676, 17681, 17685, 17699, 17703, 17712, 17712,
17713, 17795, 17721, 17739, 17739, 17742, 17745, 17753, 17762,
17794, 17768, 17768, 17788, 17564, 17638, 17777), class = "Date"),
color = c("#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39",
"#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39",
"#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39",
"#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39",
"#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39",
"#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39",
"#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39", "#DD4B39",
"purple", "purple", "purple")),
row.names = c(NA, -44L), class = c("tbl_df", "tbl", "data.frame"))
require(vistime)
gg_vistime(have,
col.group="id",
col.event="datetype",
col.start="datein",
col.end="dateout",
col.color = "color",
show_labels = FALSE,
optimize_y = TRUE,
# linewidth = 5,
title = "Figure for Stack Overflow Question")
Upvotes: 2
Views: 220
Reputation: 3194
Here is a freshcode Base R approach (no libraries) using a very basic for loop. If the patient checked in while there was an infection (started_during
), or left while there was an infection (ended_during
), or was inpatient while a infectious period began and ended (in_during
) it should flag the overlap as TRUE.
infectious_periods <- have[which(have$datetype=="Infectious Period"),]
have$overlap <- FALSE # initializes a new column
for(i in 1:nrow(have)){
if(have$datetype[i] != "Infectious Period"){
started_during <- any(have$datein[i] >= infectious_periods$datein &
have$datein[i] <= infectious_periods$dateout)
ended_during <- any(have$dateout[i] >= infectious_periods$datein &
have$dateout[i] <= infectious_periods$dateout)
in_during <- any(have$datein[i] >= infectious_periods$datein &
have$dateout[i] <= infectious_periods$dateout)
if(started_during | ended_during | in_during){
have$overlap[i] <- TRUE
}
}
}
have$overlap
# A tibble: 44 x 6
# id datetype datein dateout color overlap
# <chr> <chr> <date> <date> <chr> <lgl>
# 1 K00005 Inpatient 2018-01-11 2018-07-21 #DD4B39 TRUE
# 2 K52253 Outpatient 2018-01-13 2018-01-13 #DD4B39 TRUE
# 3 K32022 Inpatient 2018-01-25 2018-01-29 #DD4B39 TRUE
# 4 K20113 Outpatient 2018-01-28 2018-01-28 #DD4B39 TRUE
# 5 K52253 Outpatient 2018-02-24 2018-02-24 #DD4B39 FALSE
# 6 K00164 Outpatient 2018-03-12 2018-03-12 #DD4B39 FALSE
# 7 K00164 Outpatient 2018-03-18 2018-03-18 #DD4B39 FALSE
# 8 K10003 Outpatient 2018-04-02 2018-04-02 #DD4B39 FALSE
# 9 K00347 Outpatient 2018-04-05 2018-04-05 #DD4B39 TRUE
#10 K00046 Inpatient 2018-04-05 2018-04-17 #DD4B39 TRUE
# ... with 34 more rows
If this doesn't solve your needs let me know. There is definitely more that could be done like a count of the other infectious periods the patient overlaps with, but this should get you started.
Code for pic
library(ggplot2)
have$size <- ifelse(have$overlap,2,1)
ggplot(have, aes(datein,datetype,col=datetype,shape=datetype,cex = size)) + geom_point() +
facet_grid(rows = vars(id),switch = "y") +
geom_vline(xintercept=infectious_periods$datein) +
geom_vline(xintercept=infectious_periods$dateout) +
theme(strip.text.y.left = element_text(angle = 0)) +
geom_linerange(aes(xmin = datein, xmax = dateout), color = have$color,size = 2)
Upvotes: 2