Reputation: 237
This is the data I currently have:
df
patient ID Index_admission? adm_date dish_date
1244 FALSE 2/7/2009 2/8/2009
1244 TRUE 3/5/2009 3/15/2009
1244 FALSE 4/5/2011 4/7/2011
1244 FALSE 3/25/2012 3/27/2012
1244 TRUE 5/5/2012 5/20/2012
1244 TRUE 9/8/2013 9/15/2013
1244 FALSE 1/5/2014 1/15/2014
2333 FALSE 1/1/2010 1/8/2010
2333 FALSE 1/1/2011 1/5/2011
2333 TRUE 2/2/2011 2/25/2011
2333 FALSE 1/25/2012 1/28/2012
5422 TRUE 3/5/2015 3/15/2015
1243 TRUE 2/5/2009 2/8/2009
1243 TRUE 2/5/2011 2/19/2011
I need to find the time_to_readmission from the previous Index_admission. I will need to add a new column which subtracts the adm_date from the correct dish_date. This should only be done if the patient has already had a TRUE for Index_admission.
Also, the time_to_readmission should always be calculated relative to the nearest Index_admission date if the patient has multiple Index_admission rows.
It is probably easier to explain by showing how I want the data to look:
df1
patient ID Index_admission? adm_date dish_date time_to_readmission
1244 FALSE 2/7/2009 2/8/2009 NA
1244 TRUE 3/5/2009 3/15/2009 NA
1244 FALSE 4/5/2011 4/7/2011 751
1244 FALSE 3/25/2012 3/27/2012 1106
1244 TRUE 5/5/2012 5/20/2012 1147
1244 TRUE 9/8/2013 9/15/2013 476
1244 FALSE 1/5/2014 1/15/2014 112
2333 FALSE 1/1/2010 1/8/2010 NA
2333 FALSE 1/1/2011 1/5/2011 NA
2333 TRUE 2/2/2011 2/25/2011 NA
2333 FALSE 1/25/2012 1/28/2012 334
5422 TRUE 3/5/2015 3/15/2015 NA
1243 TRUE 2/5/2009 2/8/2009 NA
1243 TRUE 2/5/2011 2/19/2011 727
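As a sanity check on the rule, two of the expected values above can be reproduced by hand with base R date arithmetic. This is only a sketch: it assumes the dates are month/day/year, and the helper name days_between is purely illustrative.

```r
# Hypothetical helper: days from a discharge date to a later admission date
days_between <- function(dish, adm, fmt = "%m/%d/%Y") {
  as.integer(as.Date(adm, format = fmt) - as.Date(dish, format = fmt))
}

# Patient 1244, admitted 4/5/2011; latest earlier index discharge is 3/15/2009
days_between("3/15/2009", "4/5/2011")   # 751

# Patient 1244, admitted 9/8/2013; latest earlier index discharge is 5/20/2012
days_between("5/20/2012", "9/8/2013")   # 476
```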
Please help me with the required coding. Thanks in advance.
> dput(df)
structure(list(patient.ID = c(124L, 124L, 124L, 124L, 124L, 124L,
124L, 233L, 233L, 233L, 233L, 542L, 1243L, 1243L), Index.admission. = c(FALSE,
TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE,
TRUE, TRUE, TRUE), adm_date = structure(c(8L, 10L, 12L, 9L, 13L,
14L, 4L, 1L, 2L, 5L, 3L, 11L, 6L, 7L), .Label = c("1/1/2010",
"1/1/2011", "1/25/2012", "1/5/2014", "2/2/2011", "2/5/2009",
"2/5/2011", "2/7/2009", "3/25/2012", "3/5/2009", "3/5/2015",
"4/5/2011", "5/5/2012", "9/8/2013"), class = "factor"), dish_date = structure(c(7L,
8L, 11L, 10L, 12L, 13L, 1L, 4L, 3L, 6L, 2L, 9L, 7L, 5L), .Label = c("1/15/2014",
"1/28/2012", "1/5/2011", "1/8/2010", "2/19/2011", "2/25/2011",
"2/8/2009", "3/15/2009", "3/15/2015", "3/27/2012", "4/7/2011",
"5/20/2012", "9/15/2013"), class = "factor")), .Names = c("patient.ID",
"Index.admission.", "adm_date", "dish_date"), class = "data.frame", row.names = c(NA,
-14L))
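The dput output shows that adm_date and dish_date are stored as factors, so they would need converting to Date before any subtraction. A minimal sketch, assuming month/day/year strings; the data frame ex below holds just the first two rows of the table as a stand-in for df.

```r
# First two rows of the table, with dates deliberately stored as factors
ex <- data.frame(
  patient.ID = c(1244L, 1244L),
  adm_date   = factor(c("2/7/2009", "3/5/2009")),
  dish_date  = factor(c("2/8/2009", "3/15/2009"))
)

# Convert via as.character(): as.integer() on a factor returns level codes,
# not anything related to the underlying date
ex$adm_date  <- as.Date(as.character(ex$adm_date),  format = "%m/%d/%Y")
ex$dish_date <- as.Date(as.character(ex$dish_date), format = "%m/%d/%Y")

str(ex)
```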
Upvotes: 3
Views: 111
Reputation: 4907
This should work. The single-row branch returns NA_integer_ rather than a plain NA; with a plain NA I got a data.table type error when I ran it (presumably because the groups then return different types), although the resulting values were still correct.
One caveat: this calculates the time to readmission from the first dish_date meeting your criteria, which is what you request in the post: "subtracts the adm_date from the dish_date (of a previous row)". You don't specify which previous row, so I'm taking the first dish_date meeting your criteria.
From your example output, that's not exactly what you're doing. Instead, you appear to apply some other rule for choosing the previous row, and it's not clear what that rule is. Clarify the question if you want a different output.
calc_readmit <- function(df) {
  # A single admission cannot be a readmission; return an integer NA so
  # that every group yields the same type
  if (nrow(df) == 1) return(NA_integer_)
  # admitted[i] = number of index admissions strictly before row i
  admitted <- c(0, cumsum(df$Index_admission.))
  admitted <- admitted[-length(admitted)]
  # Discharge date of the first index admission in this group
  dt1 <- df$dish_date[min(which(admitted > 0)) - 1]
  admit2 <- ifelse(admitted > 0, dt1, NA)
  time <- as.integer(df$adm_date) - admit2
  as.integer(ifelse(admitted > 0, time, NA))
}
library(data.table)
# Assumes the ID column is named "id" and adm_date / dish_date are Date objects
df <- data.table(df, key = "id")
df <- df[, time_to_readmission := calc_readmit(.SD), by = "id"]
R> df
id Index_admission. adm_date dish_date time_to_readmission
1: 1243 TRUE 2009-02-05 2009-02-08 NA
2: 1243 TRUE 2011-02-05 2011-02-19 727
3: 1244 FALSE 2009-02-07 2009-02-08 NA
4: 1244 TRUE 2009-03-05 2009-03-15 NA
5: 1244 FALSE 2011-04-05 2011-04-07 751
6: 1244 FALSE 2012-03-25 2012-03-27 1106
7: 1244 TRUE 2012-05-05 2012-05-20 1147
8: 1244 TRUE 2013-09-08 2013-09-15 1638
9: 1244 FALSE 2014-01-05 2014-01-15 1757
10: 2333 FALSE 2010-01-01 2010-01-08 NA
11: 2333 FALSE 2011-01-01 2011-01-05 NA
12: 2333 TRUE 2011-02-02 2011-02-25 NA
13: 2333 FALSE 2012-01-25 2012-01-28 334
14: 5422 TRUE 2015-03-05 2015-03-15 NA
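If the intended rule is instead "days since the discharge of the most recent earlier index admission", which would give the 476 and 112 in the desired output, a base R sketch could look like the following. It is not the function above: the helper last_index_gap and the column names id and index are made up for illustration, and rows are assumed to be in chronological order within each patient.

```r
# Days from the discharge of the most recent *earlier* index admission to
# each admission; NA when the patient has no earlier index admission.
last_index_gap <- function(is_index, adm, dish) {
  out <- rep(NA_integer_, length(adm))
  last_dish <- as.Date(NA)   # discharge date of the latest index admission seen so far
  for (i in seq_along(adm)) {
    if (!is.na(last_dish)) out[i] <- as.integer(adm[i] - last_dish)
    if (is_index[i]) last_dish <- dish[i]
  }
  out
}

fmt <- "%m/%d/%Y"
d <- data.frame(
  id    = c(rep(1244L, 7), rep(2333L, 4), 5422L, 1243L, 1243L),
  index = c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE,
            FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE),
  adm_date  = as.Date(c("2/7/2009", "3/5/2009", "4/5/2011", "3/25/2012",
                        "5/5/2012", "9/8/2013", "1/5/2014", "1/1/2010",
                        "1/1/2011", "2/2/2011", "1/25/2012", "3/5/2015",
                        "2/5/2009", "2/5/2011"), format = fmt),
  dish_date = as.Date(c("2/8/2009", "3/15/2009", "4/7/2011", "3/27/2012",
                        "5/20/2012", "9/15/2013", "1/15/2014", "1/8/2010",
                        "1/5/2011", "2/25/2011", "1/28/2012", "3/15/2015",
                        "2/8/2009", "2/19/2011"), format = fmt)
)

# Apply the helper separately to each patient's rows
d$time_to_readmission <- NA_integer_
for (rows in split(seq_len(nrow(d)), d$id)) {
  d$time_to_readmission[rows] <-
    last_index_gap(d$index[rows], d$adm_date[rows], d$dish_date[rows])
}
d
```

The loop keeps only the latest index discharge seen so far, so a row that is itself an index admission is still measured against the previous one before it becomes the new reference point.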
Upvotes: 1