Reputation: 13
I am trying to calculate the time to next treatment for patient data pulled from my hospital. Each patient may receive 1 to 4 lines of treatment. This is what my data looks like:
df <- read.table(text = "Patient Treatment Start
A End 2018-11-22
A Drug3 2015-03-10
A None 2015-02-20
B End 2017-11-09
B Drug1 2017-01-31
B Drug2 2017-05-16
B Drug1 2017-02-28
B None 2017-03-21
C End 2018-11-08
C Drug1 2011-08-02
C Drug2 2012-01-13
C Drug3 2013-12-04",
header = TRUE,
colClasses = c("character", "character", "Date"))
I tried group_by and summarise. That gave me the results, but it is not elegant and does not scale beyond the number of treatment lines I hard-coded:
library(dplyr)
Result <- df %>%
  group_by(Patient) %>%
  summarise(Regimen1 = Treatment[1],
            Start1 = Start[1],
            TTN1 = (Start[2] - Start[1])/28,
            Regimen2 = Treatment[2],
            Start2 = Start[2],
            TTN2 = (Start[3] - Start[2])/28,
            Regimen3 = Treatment[3],
            Start3 = Start[3],
            TTN3 = (Start[4] - Start[3])/28,
            Regimen4 = Treatment[4],
            Start4 = Start[4],
            TTN4 = (Start[5] - Start[4])/28)
Could you suggest a better solution?
Upvotes: 1
Views: 382
Reputation: 6234
If the goal is to calculate the time until next treatment (column TTN) for each patient, you could arrange the data by (Patient name and) Start date; group_by Patient name; and mutate an additional column TTN that holds the time lag (in days) until the next treatment.
library(dplyr)
arrange(df, Patient, Start) %>%
  group_by(Patient) %>%
  mutate(TTN = lead(Start) - Start) %>%
  ungroup()
#> # A tibble: 12 x 4
#>    Patient Treatment Start      TTN
#>    <chr>   <chr>     <date>     <drtn>
#>  1 A       None      2015-02-20   18 days
#>  2 A       Drug3     2015-03-10 1353 days
#>  3 A       End       2018-11-22   NA days
#>  4 B       Drug1     2017-01-31   28 days
#>  5 B       Drug1     2017-02-28   21 days
#>  6 B       None      2017-03-21   56 days
#>  7 B       Drug2     2017-05-16  177 days
#>  8 B       End       2017-11-09   NA days
#>  9 C       Drug1     2011-08-02  164 days
#> 10 C       Drug2     2012-01-13  691 days
#> 11 C       Drug3     2013-12-04 1800 days
#> 12 C       End       2018-11-08   NA days
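The attempt in the question divides each gap by 28 to express it in 28-day cycles. A minimal sketch of the same idea on top of the arrange / group_by / mutate steps, assuming the End row only marks the end of follow-up and can be dropped once the lag has been computed. The names ttn and TTN_cycles are made up for illustration.

```r
library(dplyr)

# Same example data as in the question
df <- read.table(text = "Patient Treatment Start
A End 2018-11-22
A Drug3 2015-03-10
A None 2015-02-20
B End 2017-11-09
B Drug1 2017-01-31
B Drug2 2017-05-16
B Drug1 2017-02-28
B None 2017-03-21
C End 2018-11-08
C Drug1 2011-08-02
C Drug2 2012-01-13
C Drug3 2013-12-04",
  header = TRUE,
  colClasses = c("character", "character", "Date"))

ttn <- df %>%
  arrange(Patient, Start) %>%
  group_by(Patient) %>%
  # difftime in days, converted to a plain number and scaled to 28-day cycles
  mutate(TTN_cycles = as.numeric(lead(Start) - Start, units = "days") / 28) %>%
  ungroup() %>%
  # assumption: "End" is a follow-up marker, not a treatment line
  filter(Treatment != "End")
```

Because the End row is the latest date for each patient in this data, dropping it after the lag is computed also removes the only NA values.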
Data
df <- structure(list(Patient = c("A", "A", "A", "B", "B", "B", "B",
"B", "C", "C", "C", "C"), Treatment = c("End", "Drug3", "None",
"End", "Drug1", "Drug2", "Drug1", "None", "End", "Drug1", "Drug2",
"Drug3"), Start = structure(c(17857, 16504, 16486, 17479, 17197,
17302, 17225, 17246, 17843, 15188, 15352, 16043), class = "Date")), class = "data.frame", row.names = c(NA,
-12L))
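If the one-row-per-patient layout from the question (Regimen1, Start1, TTN1, ...) is still wanted, the long result can be reshaped instead of hard-coding each index. A rough sketch, assuming the tidyr package is available and that every row other than End counts as a treatment line (including None). The helper column Line and the object name wide are hypothetical.

```r
library(dplyr)
library(tidyr)

# Same example data, rebuilt here so the block runs on its own
df <- read.table(text = "Patient Treatment Start
A End 2018-11-22
A Drug3 2015-03-10
A None 2015-02-20
B End 2017-11-09
B Drug1 2017-01-31
B Drug2 2017-05-16
B Drug1 2017-02-28
B None 2017-03-21
C End 2018-11-08
C Drug1 2011-08-02
C Drug2 2012-01-13
C Drug3 2013-12-04",
  header = TRUE,
  colClasses = c("character", "character", "Date"))

wide <- df %>%
  arrange(Patient, Start) %>%
  group_by(Patient) %>%
  mutate(TTN = as.numeric(lead(Start) - Start, units = "days") / 28,
         Line = row_number()) %>%   # 1, 2, 3, ... within each patient
  ungroup() %>%
  filter(Treatment != "End") %>%    # assumption: End is not a treatment line
  rename(Regimen = Treatment) %>%
  pivot_wider(id_cols = Patient,
              names_from = Line,
              values_from = c(Regimen, Start, TTN),
              names_sep = "")       # yields Regimen1, Start1, TTN1, ...
```

The number of column sets follows the longest treatment history in the data, so a patient with a fifth line would simply add Regimen5, Start5 and TTN5. Patients with fewer lines get NA in the unused columns.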
Upvotes: 1