user86533

Reputation: 333

C-statistics and 95% confidence interval for Cox-model with time-dependent covariates

I am fitting a Cox regression with a time-dependent covariate, and I'm specifically interested in the 95% confidence interval of the concordance index (C-index). However, the standard summary of the coxph model returns only the concordance index and its standard error. Is there a way to also get the 95% CI?

Thanks!

library(survival)

temp <- subset(pbc, id <= 312, select = c(id:sex, stage))
pbc2 <- tmerge(temp, temp, id = id, death = event(time, status)) # set range
pbc2 <- tmerge(pbc2, pbcseq, id = id, ascites = tdc(day, ascites),
               bili = tdc(day, bili), albumin = tdc(day, albumin),
               protime = tdc(day, protime), alk.phos = tdc(day, alk.phos))
fit2 <- coxph(Surv(tstart, tstop, death == 2) ~ log(bili) + log(protime), pbc2)

summary(fit2)

coxph(formula = Surv(tstart, tstop, death == 2) ~ log(bili) + 
    log(protime), data = pbc2)

  n= 1807, number of events= 125 

                 coef exp(coef) se(coef)      z Pr(>|z|)    
log(bili)     1.24121   3.45981  0.09697 12.800   <2e-16 ***
log(protime)  3.98340  53.69929  0.43589  9.139   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

             exp(coef) exp(-coef) lower .95 upper .95
log(bili)         3.46    0.28903     2.861     4.184
log(protime)     53.70    0.01862    22.853   126.181

**Concordance= 0.886  (se = 0.029 )**
Rsquare= 0.168   (max possible= 0.508 )
Likelihood ratio test= 332.1  on 2 df,   p=<2e-16
Wald test            = 263.3  on 2 df,   p=<2e-16
Score (logrank) test = 467.8  on 2 df,   p=<2e-16

Would it make sense to use the validate function from the rms package to get a 95% CI for the C-index by bootstrapping? I came up with the following code. What do you think? I'm not sure, however, how to correctly treat the Dxy values from the training / test columns (the CI from the training column seems OK to me, whereas the CI from the test column seems very narrow).

library(survival)
library(rms)
library(tidyboot)

temp <- subset(pbc, id <= 312, select = c(id:sex, stage))
pbc2 <- tmerge(temp, temp, id = id, death = event(time, status)) # set range
pbc2 <- tmerge(pbc2, pbcseq, id = id, ascites = tdc(day, ascites),
               bili = tdc(day, bili), albumin = tdc(day, albumin),
               protime = tdc(day, protime), alk.phos = tdc(day, alk.phos))
fit2 <- cph(Surv(tstart, tstop, death == 2) ~ log(bili) + log(protime), pbc2,
            x = TRUE, y = TRUE, surv = TRUE)
set.seed(1)
# Capture the per-resample lines that validate() prints when pr = TRUE
output <- capture.output(validate(fit2, method = "boot", B = 1000, dxy = TRUE, pr = TRUE))
head(output)
output <- as.matrix(output)
# Keep only the printed Dxy lines and collapse repeated whitespace
output_dxy <- as.matrix(output[grep('^Dxy', output[,1]),])
output_dxy <- gsub("(?<=[\\s])\\s*|^\\s+|\\s+$", "", output_dxy, perl = TRUE)
# Convert Dxy to the C-index: C = |Dxy| / 2 + 0.5
train <- abs(as.numeric(lapply(strsplit(output_dxy, split = " "), "[", 2))[1:1000]) / 2 + 0.5
test <- abs(as.numeric(lapply(strsplit(output_dxy, split = " "), "[", 3))[1:1000]) / 2 + 0.5
summary(train)
summary(test)
# Interval limits for the training and test columns (tidyboot helpers)
ci_lower(train, na.rm = FALSE)
ci_upper(train, na.rm = FALSE)
ci_lower(test, na.rm = FALSE)
ci_upper(test, na.rm = FALSE)

Upvotes: 0

Views: 4431

Answers (1)

Frank Harrell

Reputation: 2230

As an aside, it is unlikely that the relationships are linear in log bili and log protime. Spline functions in the logs are warranted.
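One way to check the linearity assumption is sketched below. It is an illustration only: it assumes natural cubic splines from the splines package with 3 degrees of freedom per predictor, a choice that is not part of the original question, and it rebuilds pbc2 exactly as the question does. The names fit_lin and fit_ns are made up for this example.

```r
library(survival)
library(splines)

# Rebuild the counting-process data set as in the question
temp <- subset(pbc, id <= 312, select = c(id:sex, stage))
pbc2 <- tmerge(temp, temp, id = id, death = event(time, status))
pbc2 <- tmerge(pbc2, pbcseq, id = id,
               bili = tdc(day, bili), protime = tdc(day, protime))

# Model from the question: linear in the logs
fit_lin <- coxph(Surv(tstart, tstop, death == 2) ~ log(bili) + log(protime),
                 data = pbc2)

# Assumed alternative: natural cubic splines in the logs (df = 3 is arbitrary)
fit_ns <- coxph(Surv(tstart, tstop, death == 2) ~
                  ns(log(bili), df = 3) + ns(log(protime), df = 3),
                data = pbc2)

# Likelihood ratio comparison of the nested fits, plus AIC
anova(fit_lin, fit_ns)
AIC(fit_lin, fit_ns)
```

If the spline fit is clearly better, the C-index and its standard error should be taken from that model rather than from the linear one.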

Before using the concordance probability estimate of 0.886, you need to verify from the documentation of the R survival package that:

  • The estimate is meant to handle time-dependent covariates
  • The standard error accounts for the uncertainty of estimating two regression coefficients

If both of those are satisfied, you can get a rough 0.95 confidence interval for the c-index as estimate ± 1.96 × se.
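As a minimal sketch of that arithmetic, using the concordance and standard error printed in the summary output of the question (the variable names are made up for illustration, and the normal approximation is only rough this close to the upper bound of 1):

```r
# Values copied from the summary(fit2) output in the question
c_index <- 0.886
se      <- 0.029

# Wald-type 0.95 interval: estimate +/- z * se, with z = qnorm(0.975), about 1.96
z  <- qnorm(0.975)
ci <- c_index + c(-1, 1) * z * se

round(ci, 3)
# 0.829 0.943
```

The same two numbers can be plugged in for any other model once the two conditions above have been checked.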

Upvotes: 3
