How to divide imputed data with 'survSplit' function and finally pool the analysis results using 'pool' function in MICE package

Question

When I do cox regression using R, I found the proporitonal hazard assumption is violated. So I decided to divide the data into different time intervals using the 'survSplit' function and calculate the HR for each time interval.

For example:

1) divide data into different time intervals

`D1 <- survSplit(Surv(time, status) ~ ., data= data1, cut=c(40, 60),
episode= "tgroup", id="id")`

2) to get HR for each time interval

`coxph(Surv(tstart, time, status) ~ age + class:strata(tgroup), data=D1)`

Now I want to do the same for multiply-imputed datasets (n=5).

step 1 data imputation

imp <- mice(data1, seed = 12345, n=5,print = FALSE)

step 2 reanalysis

fit <- with(imp, coxph(Surv(tstart, time, status) ~ age + class:strata(tgroup))

step 3 pool the results

`summary(pool(fit))`

My question is how I can post process the imputed data to divide each imputed dataset into different time intervals using 'survSplit' between step 1 and step 2. After post-processing, I can continue to do step 2 and step3.

Although I can extract each imputed dataset and then divide it using 'survSplit', I am not able to continue step2 and step 3 using the MICE package. Kindly help. Thanks.

David L Thiessen · Accepted Answer

In this case the best approach is to use {} inside the with() command, which will allow you to place multiple statements inside. Here's a demonstration, based on the example in the survSplit() documentation.

library(survival)
library(mice)
set.seed(123)

# use the veteran dataset with some values made missing
veteran2 <- veteran
veteran2[which(rbinom(nrow(veteran2),1,0.1) == 1),
         "karno"] <- NA
# make imputations
imps <- mice(veteran2, printFlag = FALSE)
# fit the models, with survSplit used inside each call
mods <- with(imps, 
             {
               dat <- data.frame(mget(ls()))
               dat_split <- survSplit(Surv(time, status) ~ ., 
                                      data = dat,
                                      cut = c(60, 120), 
                                      episode = "timegroup")
               coxph(Surv(tstart, time, status) ~ karno*strata(timegroup), 
                     data = dat_split)
             })
pooled <- pool(mods)
summary(pooled)
#>                                 term    estimate   std.error statistic
#> 1                              karno -0.04986785 0.007715564 -6.463279
#> 2 karno:strata(timegroup)timegroup=2  0.03645985 0.015247181  2.391252
#> 3 karno:strata(timegroup)timegroup=3  0.04104428 0.013389159  3.065486
#>          df      p.value
#> 1  70.51041 1.141572e-08
#> 2  94.68223 1.876833e-02
#> 3 108.83144 2.740870e-03

c(t0_60 = pooled$pooled$estimate[1],
  t60_120 = sum(pooled$pooled$estimate[c(1,2)]),
  t_120 = sum(pooled$pooled$estimate[c(1,3)]))
#>        t0_60      t60_120        t_120 
#> -0.049867847 -0.013407999 -0.008823565

^{Created on 2022-08-24 by the reprex package (v2.0.1)}

How to divide imputed data with 'survSplit' function and finally pool the analysis results using 'pool' function in MICE package

1) divide data into different time intervals

2) to get HR for each time interval

step 1 data imputation

step 2 reanalysis

step 3 pool the results

Answers (1)

Related Questions

How to divide imputed data with &#39;survSplit&#39; function and finally pool the analysis results using &#39;pool&#39; function in MICE package

1) divide data into different time intervals

2) to get HR for each time interval

step 1 data imputation

step 2 reanalysis

step 3 pool the results

Answers (1)

Related Questions

How to divide imputed data with 'survSplit' function and finally pool the analysis results using 'pool' function in MICE package