Kaipol

Reputation: 13

Calculate time to next treatment for a varying number of treatment lines per patient

I am trying to calculate the time to next therapy from patient data pulled from my hospital. Each patient may receive 1 to 4 lines of treatment. This is what my data looks like:

df <- read.table(text = "Patient Treatment   Start
        A   End         2018-11-22
        A   Drug3       2015-03-10
        A   None        2015-02-20
        B   End         2017-11-09
        B   Drug1       2017-01-31
        B   Drug2       2017-05-16
        B   Drug1       2017-02-28
        B   None        2017-03-21
        C   End         2018-11-08
        C   Drug1       2011-08-02
        C   Drug2       2012-01-13
        C   Drug3       2013-12-04", 
    header = TRUE, 
    colClasses = c("character", "character", "Date"))

I tried group_by and summarise. That gives me the results, but it is not elegant and does not scale beyond the four treatment lines hard-coded below:

library(dplyr)

Result <- df %>% 
    group_by(Patient) %>%
    summarise(Regimen1 = Treatment[1],
        Start1 = Start[1],
        TTN1 = (Start[2] - Start[1])/28,
        Regimen2 = Treatment[2],
        Start2 = Start[2],
        TTN2 = (Start[3] - Start[2])/28,
        Regimen3 = Treatment[3],
        Start3 = Start[3],
        TTN3 = (Start[4] - Start[3])/28,
        Regimen4 = Treatment[4],
        Start4 = Start[4],
        TTN4 = (Start[5] - Start[4])/28)

Could you suggest a better solution?

Upvotes: 1

Views: 382

Answers (1)

Joris C.

Reputation: 6234

If the goal is to calculate the time until the next treatment (column TTN) for each patient, you could:

  1. arrange the data by Patient and Start date;
  2. group_by Patient;
  3. mutate an additional column TTN holding the time lag (in days) until the next treatment.

library(dplyr)

arrange(df, Patient, Start) %>% 
    group_by(Patient) %>%
    mutate(TTN = lead(Start) - Start) %>%
    ungroup()
#> # A tibble: 12 x 4
#>    Patient Treatment Start      TTN      
#>    <chr>   <chr>     <date>     <drtn>   
#>  1 A       None      2015-02-20   18 days
#>  2 A       Drug3     2015-03-10 1353 days
#>  3 A       End       2018-11-22   NA days
#>  4 B       Drug1     2017-01-31   28 days
#>  5 B       Drug1     2017-02-28   21 days
#>  6 B       None      2017-03-21   56 days
#>  7 B       Drug2     2017-05-16  177 days
#>  8 B       End       2017-11-09   NA days
#>  9 C       Drug1     2011-08-02  164 days
#> 10 C       Drug2     2012-01-13  691 days
#> 11 C       Drug3     2013-12-04 1800 days
#> 12 C       End       2018-11-08   NA days
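
The question divides the lag by 28 and lays the result out with one row per patient. A minimal sketch of how the long result above could be taken in that direction, assuming tidyr is available, that the "End" rows only mark the end of follow-up, and that the names `Line`, `long` and `wide` are purely illustrative:

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
df <- data.frame(
  Patient   = c("A", "A", "A", "B", "B", "B", "B", "B", "C", "C", "C", "C"),
  Treatment = c("End", "Drug3", "None", "End", "Drug1", "Drug2", "Drug1",
                "None", "End", "Drug1", "Drug2", "Drug3"),
  Start     = as.Date(c("2018-11-22", "2015-03-10", "2015-02-20",
                        "2017-11-09", "2017-01-31", "2017-05-16",
                        "2017-02-28", "2017-03-21", "2018-11-08",
                        "2011-08-02", "2012-01-13", "2013-12-04")),
  stringsAsFactors = FALSE
)

long <- df %>%
  arrange(Patient, Start) %>%            # sort first: the sample data is not in date order
  group_by(Patient) %>%
  mutate(
    Line = row_number(),                 # 1, 2, 3, ... within each patient
    # lag to the next row in days, expressed in 28-day units as in the question
    TTN  = as.numeric(lead(Start) - Start, units = "days") / 28
  ) %>%
  ungroup() %>%
  filter(Treatment != "End")             # assumption: "End" is not a treatment line

# One row per patient; columns Treatment1, Start1, TTN1, Treatment2, ...
wide <- long %>%
  pivot_wider(
    id_cols     = Patient,
    names_from  = Line,
    values_from = c(Treatment, Start, TTN),
    names_sep   = ""
  )
```

Because the column count follows the largest value of `Line`, a patient with a fifth or sixth treatment line would simply add more columns, and patients with fewer lines get NA in the unused ones.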

Data

df <- structure(list(Patient = c("A", "A", "A", "B", "B", "B", "B", 
"B", "C", "C", "C", "C"), Treatment = c("End", "Drug3", "None", 
"End", "Drug1", "Drug2", "Drug1", "None", "End", "Drug1", "Drug2", 
"Drug3"), Start = structure(c(17857, 16504, 16486, 17479, 17197, 
17302, 17225, 17246, 17843, 15188, 15352, 16043), class = "Date")), class = "data.frame", row.names = c(NA, 
-12L))

Upvotes: 1
