Rproject
Rproject

Reputation: 21

How do I change confidence interval calculation to log-log on R?

If I wanted to calculate confidence intervals using the coxph and confinf functions, how do I change the confidence interval calculation to log-log? My understanding is that log is the default.

I tried conf.type="log-log" but it did not work, just got an error message


library(survival) 

coxph(formula = Surv(futime, fustat) ~ tx, data = tki, conf.type="log-log") 

fit <- coxph(formula = Surv(futime, fustat) ~ tx, data = tki)

summary(fit) 

#output provides HR CIs 
confint(fit) 

#coefficient CIs 
exp(confint(fit)) 

> dput(tki) structure(list(futime = c(9.26, 11.06, 2.35, 3.75, 12.4, 10.3,  8.11, 7.29, 6.75, 6.56, 0.26, 1.9, 0.34, 1.63, 1.55, 1.6, 4.78,  2.65, 1.72, 3.63), fustat = c(1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1,  1, 1, 1, 0, 0, 0, 1, 1, 1), tx = c(1, 1, 1, 1, 1, 1, 1, 1, 1,  1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), class = c("tbl_df", "tbl",  "data.frame"), row.names = c(NA, -20L)) 

Upvotes: 2

Views: 1237

Answers (1)

David L Thiessen
David L Thiessen

Reputation: 694

The log(-log()) method of producing confidence intervals applies to the estimates of survival probabilities (or related transformations). We need to use the survfit() function to create those, confint() will not work. The default method is log as you note. Supplying conf.type = "log-log" changes the method as desired.

Note however, that the use of survfit() with a coxph() model almost always requires you to correctly specify the newdata = option. See the details in ?survfit.coxph. Ignoring this will usually still give you some result, but it will be wrong in subtle ways.

library(survival) 
tki = structure(list(
  futime = c(9.26, 11.06, 2.35, 3.75, 12.4, 10.3,  8.11, 7.29, 6.75, 
             6.56, 0.26, 1.9, 0.34, 1.63, 1.55, 1.6, 4.78,  2.65, 1.72, 3.63), 
  fustat = c(1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1,  1, 1, 1, 0, 0, 0, 1, 1, 1), 
  tx = c(1, 1, 1, 1, 1, 1, 1, 1, 1,  1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), 
  class = c("tbl_df", "tbl",  "data.frame"), row.names = c(NA, -20L)) 
coxmod <- coxph(Surv(futime, fustat) ~ tx, data = tki)
summary(coxmod)
#> Call:
#> coxph(formula = Surv(futime, fustat) ~ tx, data = tki)
#> 
#>   n= 20, number of events= 11 
#> 
#>      coef exp(coef) se(coef)     z Pr(>|z|)   
#> tx 2.2783    9.7599   0.8335 2.733  0.00627 **
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#>    exp(coef) exp(-coef) lower .95 upper .95
#> tx      9.76     0.1025     1.905        50
#> 
#> Concordance= 0.744  (se = 0.042 )
#> Likelihood ratio test= 9.41  on 1 df,   p=0.002
#> Wald test            = 7.47  on 1 df,   p=0.006
#> Score (logrank) test = 10.34  on 1 df,   p=0.001
confint(coxmod) #confint provides confidence interval for coefficient estimate
#>       2.5 %   97.5 %
#> tx 0.644588 3.911983
exp(confint(coxmod)) #confidence interval for hazard ratio
#>       2.5 %   97.5 %
#> tx 1.905202 49.99797

# We need to define 'newdata' to use in survfit, include all covariates.
# If you don't use newdata, the results are almost certainly not what you want.
newdat = data.frame(tx = c(1, 2))

# Fit model for survival function using survfit. Set confidence interval type
survmod <- survfit(coxmod, newdata = newdat, conf.type = "log-log")

# Upper and lower confidence intervals for survival probability, 1 column
# for each row of newdata.
survmod$upper
#>               1         2
#>  [1,] 0.9992040 0.9873779
#>  [2,] 0.9974256 0.9539678
#>  [3,] 0.9974256 0.9539678
#>  ...
survmod$lower
#>               1            2
#>  [1,] 0.8972081 5.232360e-01
#>  [2,] 0.8626622 4.631231e-01
#>  [3,] 0.8626622 4.631231e-01
#>  ...
plot(survmod, conf.int = T, col = c("red","blue"))

Created on 2023-02-21 with reprex v2.0.2


Reply to comment:

is there a way to simply generate a hazard ratio and a 95% confidence interval calculated with the log-log method?

I think you have misunderstood something about how confidence intervals are calculated and what the different methods are estimating.

By default, coxph() estimates the variance of the parameters using the inverse of the information matrix. An alternative is to use a "robust" variance estimate by entering robust = TRUE in the coxph() call. (The robust estimate uses an infintisimal jackknife, see 2.7 of the package documentation vignette, https://cran.r-project.org/web/packages/survival/vignettes/survival.pdf .) The confint() function then uses the estimated variance to construct a "wald-type" confidence interval for the parameter estimates. Finally, taking the exp() of the confidence interval for the parameter estimate gives a confidence interval for the hazard ratio. (The code included in the question seems to have a mistake in labeling the confidence intervals. But, you can see the result matches summary(fit).)

At no point in the above calculations was a log or a log-log confidence interval used or available. If you try to include a conf.type = option in the coxph() call then it gives an error and refuses to run.

The log method of calculating confidence intervals is the default for the survfit() function. By specifying conf.type = option in the survfit() function you can change the method used from "log" to any of the supported options: "log", "log-log", "plain", "none", "logit", or "arcsin". survfit() uses these methods to calculate a confidence interval for the survival probability. survfit() does not estimate the hazard ratio, it bases the estimate of the survival probability off of the hazard ratio that was estimated by coxph().

Upvotes: 2

Related Questions