user9371378

Reputation: 1

Fix survival() environment "Error in is.data.frame" error using pryr or other tidyverse tools

I am struggling to find an elegant resolution for an environment issue that I continually run into using the survival library in R. Here is a toy example of the problem I am having:

library(survival)  # coxph(), survfit(), Surv()
library(magrittr)  # %>%

# Make a fake data set
set.seed(1)
BigData <- cbind(rexp(100, .3), rbinom(100, 1, .7), 
             matrix(rnorm(300), ncol = 3)) %>% data.frame
names(BigData) <- c('time','event', 'var1', 'var2', 'var3')

# Make a function for fitting the model (myFitFunction).
# I am allowed to edit this function.
myFitFunction <- function(origdata, formula, ...){
  fit <- coxph(formula, data = origdata, ...)
  return(fit)
}
# There exists a function for fitting the 
# same model with new data (otherFitFunction).
# For the purposes of this example, say I cannot edit this one.
otherFitFunction <- function(object, newdata){
  survfit(object, newdata=newdata)
}
myMod <- myFitFunction(BigData[1:75,], 
        as.formula(Surv(time, event) ~ var1+var2+var3))
otherFitFunction(myMod, BigData[76:100,])

This gives me the error message:

"Error in is.data.frame(data) : object 'origdata' not found Calls: otherFitFunction ... -> model.frame.default -> is.data.frame"

I know this is a common issue, particularly when doing cross-validation, and there are some solutions out there, such as those in the question "in R: Error in is.data.frame(data) : object '' not found, C5.0 plot". (More specifically, I know the issue in this example comes from the stats::model.frame() call at around line 55 of the "survfit.coxph.R" file in the survival package.) From reading other posts on Stack Exchange, I have found a solution to my problem, which is to change myFitFunction() to:

# myenv is assumed to be an environment created beforehand, e.g.:
myenv <- new.env()

myFitFunction <- function(origdata, formula, ...){
  myenv$origdata <- origdata
  fit <- coxph(formula, data = origdata, ...)
  environment(fit$formula) <- myenv
  fit$terms <- terms(fit$formula)

  return(fit)
}

However, all code I have seen or used seems very hacky (including mine, which requires me to save origdata every time). Additionally, in my real code, I cannot actually edit otherFitFunction() and can only edit or even directly access myFitFunction(), which limits my ability to use some of the solutions others have used.

I am wondering if there is a more elegant solution to this issue. I've tried playing with the pryr package but cannot come up with anything that works.

Any help would be much appreciated.

Upvotes: 0

Views: 345

Answers (1)

MrFlick

Reputation: 206242

How about

myFitFunction <- function(origdata, formula, ...){
  environment(formula) <- environment()
  fit <- coxph(formula, data = origdata, ...)
  return(fit)
}

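Formulas carry an environment with them, and when survfit() rebuilds the model frame, the data = origdata argument of the stored call is looked up starting from that environment. So you just need the formula to point at the environment where origdata is defined, which here is the function's own frame.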
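To see the mechanism in isolation, here is a small base-R sketch with no survival code involved. The names makeFormula, retarget and localData are made up for illustration; only the environment handling mirrors the function above.

```r
# Hypothetical helper: builds a formula inside a function body.
makeFormula <- function() {
  localData <- data.frame(y = 1:3, x = c(2, 4, 6))
  y ~ x  # `~` records the environment it was evaluated in
}

f <- makeFormula()
# localData is still reachable through the formula's environment,
# even though makeFormula() has already returned.
found <- exists("localData", envir = environment(f), inherits = FALSE)
print(found)

# Hypothetical helper: re-points a formula built elsewhere at its own frame,
# the same move as environment(formula) <- environment() in myFitFunction.
retarget <- function(formula) {
  origdata <- data.frame(y = 1:3, x = c(2, 4, 6))
  environment(formula) <- environment()
  formula
}

g <- retarget(y ~ x)
print(exists("origdata", envir = environment(g), inherits = FALSE))
```

One side effect worth keeping in mind: the returned fit keeps that function frame alive, so origdata stays in memory for as long as the fit does.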

An alternative is to adjust the call in myFitFunction so that everything runs in the parent frame using the original variables. For example:

myFitFunction <- function(origdata, formula, ...){
  call <- match.call()
  call$formula <- formula
  call$data <- call$origdata
  call$origdata <- NULL
  call[[1]] <- quote(coxph)
  fit <- eval.parent(call)
  return(fit)
}
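If it helps to see what that rewritten call looks like before it is evaluated, here is a sketch that builds the call the same way but returns it instead of running it. buildCall is a made-up name, and y ~ x is just a placeholder formula; because origdata is never evaluated, BigData does not even need to exist for this to run.

```r
# Hypothetical helper: same call surgery as above, minus the eval.parent() step.
buildCall <- function(origdata, formula, ...) {
  call <- match.call()           # buildCall(origdata = ..., formula = ...)
  call$formula <- formula        # store the evaluated formula object
  call$data <- call$origdata     # keep the caller's expression, unevaluated
  call$origdata <- NULL          # coxph() has no origdata argument
  call[[1]] <- quote(coxph)      # swap the function being called
  call
}

cl <- buildCall(BigData[1:75, ], y ~ x)
print(cl)

# The data argument is the expression the caller typed, not a local name,
# so it can be re-evaluated later from the caller's environment.
print(deparse(cl$data))
```

Since the stored call refers to BigData[1:75, ] rather than origdata, later re-evaluation of the call only needs BigData to be visible from the formula's environment.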

Upvotes: 1
