Peanut

Reputation: 835

rpart cannot find variables when used inside a function

I am writing a function that, at one point, fits classification trees using the rpart package.

Inside the function I initialise weights for the trees. However, rpart reports that it cannot find the weight variable. This is the exact error message: Error in eval(expr, envir, enclos) : object 'w' not found.

When I run the same code outside the function, it works perfectly fine. A small toy example of the problem is given below. I don't really understand what's going on. Could it be that rpart looks for the variables in the global environment?

Toy example of the problem:

# Load Package
library(rpart)

# Create simple wrapper function for rpart
example <- function( form, data ){
  N <- nrow( data )
  w <- rep( 1/N , N ) 
  tree <- rpart( form , data = data, weights = w )
  return( tree )
}

# Load and adjust the data set / define the model
df      <- mtcars
df$mpg  <- as.factor( ifelse( df$mpg < 15 , 1 , 0 ) )
model <-  formula( mpg ~ . )

# Run function - THIS PRODUCES AN ERROR
test <- example( model, df  )

# Re-run the same code outside the function - THIS WORKS
N <- nrow( df )
w <- rep( 1/N , N ) 
rpart( model , data = df, weights =  w )

Upvotes: 4

Views: 423

Answers (2)

MrFlick

Reputation: 206167

Actually, rpart() looks for variables in the environment attached to the formula. A formula in R carries a reference to the environment in which it was created. Since you created your formula in the global environment, any variable that is not found in the data.frame is searched for there, and not inside your function, where w lives. That is also why the code works outside the function: there, w is in the global environment. You can change the formula's environment if you like:

example <- function( form, data ){
  environment(form) <- environment()  # make the formula look up variables in this function's environment
  N <- nrow( data )
  w <- rep( 1/N , N ) 
  tree <- rpart( form , data = data, weights = w )
  return( tree )
}

But mixing variables from environments and data.frames can get tricky, so be careful.
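The behaviour described above can be checked in base R without rpart. The following is only a minimal sketch: make_formula is a hypothetical helper written for illustration, not part of any package. It shows that a formula keeps a reference to the environment where it was created, and that reassigning environment() changes where lookups start.

```r
# Hypothetical helper (illustration only): returns a formula created inside a function call.
make_formula <- function() {
  w <- c(0.5, 0.5)  # local variable, lives only in this call's environment
  y ~ x             # the formula captures that environment
}

f <- make_formula()
env <- environment(f)

# The local `w` is reachable through the formula's environment.
has_w <- exists("w", envir = env, inherits = FALSE)
w_val <- get("w", envir = env)

# Pointing the formula at a fresh, empty environment removes that access.
environment(f) <- new.env()
has_w_after <- exists("w", envir = environment(f), inherits = FALSE)
```

In the question the situation is reversed: the formula was created at top level, so its environment does not contain the w defined inside example().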

Upvotes: 3

Hong Ooi

Reputation: 57686

Add the weights as a column of the data frame, so that rpart finds them in data:

example <- function( form, data ){
  N <- nrow( data )
  data$w <- rep( 1/N , N )                           # new column in data
  tree <- rpart( form , data = data, weights = w )
  return( tree )
}
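One thing to keep in mind with this approach: when the formula uses the dot, as in mpg ~ ., the dot expands to every column of data other than the response, so the new w column is also picked up as a candidate predictor. A constant column should not offer any useful split, but the sketch below, which uses a small made-up data frame and only base R, shows the expansion and one possible way to exclude the column explicitly.

```r
# Made-up toy data (illustration only).
df <- data.frame(y = c(1, 0, 1, 0), x = c(2.5, 1.0, 3.2, 0.7))
df$w <- rep(1 / nrow(df), nrow(df))  # weights stored as a column

# The dot expands to all non-response columns, including w.
dot_terms <- attr(terms(y ~ ., data = df), "term.labels")

# Removing w from the right-hand side keeps it out of the predictors.
no_w_terms <- attr(terms(y ~ . - w, data = df), "term.labels")
```

Inside a wrapper such as example(), the same effect could be obtained with update(form, . ~ . - w) before the model is fitted, assuming no column of the input data is already named w.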

Upvotes: 2
