Andreas
Andreas

Reputation: 1136

R formula in survfit

I am having trouble with formulas, environments, and survfit().

Things work fine for lm() but they fail for survfit().

General problem statement:

I am fitting a series of formulas to some data. So, I call the modeling function with the formula passed as a variable. Later, I want to work with the formula from the fitted object.

(From my naive point of view, the trouble comes from survfit not recording the environment.)

Detailed Example

Expected behaviour as seen in lm():

library("plyr")

preds <- c("wt", "qsec")

f <- function() {
  lm(mpg ~ wt, data = mtcars)
}

fits <- alply(preds, 1, function(pred)
{
  modform <- reformulate(pred, response = "mpg")

  lm(modform, data = mtcars)
})

fits[[1]]$call$formula
##modform
formula(fits[[1]])
## mpg ~ wt
## <environment: 0x1419d1a0>

Even though fits[[1]]$call$formula resolves to modform I can still get the original formula with formula(fits[[1]]).

But things fail for survfit():

library("plyr")
library("survival")

preds <- c("resid.ds", "rx", "ecog.ps")

fits <- 
  alply(preds, 1, function(pred)
  {
    modform <- paste("Surv(futime, fustat)", pred, sep = " ~ ")
    modform <- as.formula(modform)
    print(modform)

    fit <- survfit(modform, data = ovarian)
  })

fits[[1]]$call$formula
## modform
formula(fits[[1]])
## Error in eval(expr, envir, enclos) : object 'modform' not found

Here (and in contrast to lm-fits), formula(fits[[1]]) does not work!

So, my specific question is: How can I get back the formula used to fit fits[[1]]?

Upvotes: 1

Views: 2300

Answers (1)

rawr
rawr

Reputation: 20811

The issue is that when x$formula is NULL, for an lm object there is a backup plan to get the formula; this doesn't exist for survfit objects

library("plyr")
library("survival")

preds <- c("wt", "qsec")
f <- function() lm(mpg ~ wt, data = mtcars)

fits <- alply(preds, 1, function(pred) {
  modform <- reformulate(pred, response = "mpg")
  lm(modform, data = mtcars)
})

fits[[1]]$formula
# NULL

The formula can be extracted with formula(fits[[1]]) which uses the formula generic. The lm S3 method for formula is

stats:::formula.lm

# function (x, ...) 
# {
#   form <- x$formula
#   if (!is.null(form)) {
#     form <- formula(x$terms)
#     environment(form) <- environment(x$formula)
#     form
#   }
#   else formula(x$terms)
# }

So when fits[[1]]$formula returns NULL, forumla.lm looks for a terms attribute in the object and finds the formula that way

fits[[1]]$terms

The survfit objects don't have x$formula or x$terms, so formula(x) givens an error

preds <- c("resid.ds", "rx", "ecog.ps")
fits <-  alply(preds, 1, function(pred) {
    modform <- paste("Surv(futime, fustat)", pred, sep = " ~ ")
    modform <- as.formula(modform)
    fit <- survfit(modform, data = ovarian)
  })

fits[[1]]$formula
# NULL

formula(fits[[1]]) ## error

formula(fits[[1]]$terms)
# list()

You can fix this by inserting the formula into the call and evaluating it

modform <- as.formula(paste("Surv(futime, fustat)", 'rx', sep = " ~ "))
substitute(survfit(modform, data = ovarian), list(modform = modform))
# survfit(Surv(futime, fustat) ~ rx, data = ovarian)

eval(substitute(survfit(modform, data = ovarian), list(modform = modform)))

# Surv(futime, fustat) ~ rx

# Call: survfit(formula = Surv(futime, fustat) ~ rx, data = ovarian)
# 
#      n events median 0.95LCL 0.95UCL
# rx=1 13      7    638     268      NA
# rx=2 13      5     NA     475      NA

Or by manually putting the formula into x$call$formula

fit <- survfit(modform, data = ovarian)
fit$call$formula
# modform
fit$call$formula <- modform
fit$call$formula
# Surv(futime, fustat) ~ rx

fit
# Call: survfit(formula = Surv(futime, fustat) ~ rx, data = ovarian)
# 
#      n events median 0.95LCL 0.95UCL
# rx=1 13      7    638     268      NA
# rx=2 13      5     NA     475      NA

Upvotes: 5

Related Questions