Electrino

Reputation: 2890

How to get the vip package to work with an mlr model in R?

I'm not sure what I'm doing wrong here, but I'm trying to use a model created with the mlr package together with the vip package in R. Specifically, I'm trying to use the vint function from vip to calculate the 2-way interaction strength between variables. Here is basically what I'm doing:

library(mlr)
library(vip)
library(randomForest)


# Get example data
airQ <- data.frame(airquality)
airQ <- na.omit(airQ)

# Set up mlr model
airQ_task  <- makeRegrTask(data = airQ, target = "Ozone")
airQ_lrn <- makeLearner("regr.randomForest")
airQ_mod <- train(airQ_lrn, airQ_task)

# For comparison set up randomForest model
aqFit <- randomForest(Ozone~., data = airQ)

# Use vip to get variable interaction
vint(aqFit, feature_names = c("Solar.R", "Temp")) # rf model works
vint(airQ_mod, feature_names = c("Solar.R", "Temp")) # mlr model does not work

Using the code above, the vint function works for the randomForest model but not for the mlr model. For the mlr model, it throws this error:

Error in get_training_data.default(object) : The training data could not be extracted from object. Please supply the raw training data using the train argument in the call to partial.

I think the vip package was updated recently and some functions now work differently, but I can't figure out what I am doing wrong here. I also know that the vip package does accept mlr models, because the vip variable importance function works fine with one, e.g.:

vip(airQ_mod)

Any suggestions on how to make the vint function work with an mlr model?

Upvotes: 0

Views: 837

Answers (1)

Lars Kotthoff

Reputation: 109242

mlr returns a wrapped model; you need to get the underlying R model. The getLearnerModel() function is provided for this purpose:

vint(getLearnerModel(airQ_mod), feature_names = c("Solar.R", "Temp"))

However, that alone isn't enough: the pdp package (used by vip internally) does some "magic" to get to the original training data, which isn't compatible with how mlr trains its models. You'll have to supply the training data explicitly via the train argument, I'm afraid, as suggested by the error message.
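Putting the two pieces together, a minimal sketch might look like the following. It assumes a vip version that still exports vint() and that vint() forwards extra arguments to pdp::partial(), which is where the train argument named in the error message is consumed. The int_strength variable name is only for illustration.

```r
library(mlr)
library(vip)

set.seed(1)  # random forests are stochastic; fix the seed for repeatability

# Same data and mlr model as in the question
airQ <- na.omit(airquality)
airQ_task <- makeRegrTask(data = airQ, target = "Ozone")
airQ_lrn <- makeLearner("regr.randomForest")
airQ_mod <- train(airQ_lrn, airQ_task)

# Unwrap the underlying randomForest fit and pass the training data explicitly.
# Assumption: `train` is handed through vint()'s `...` to pdp::partial().
int_strength <- vint(
  getLearnerModel(airQ_mod),
  feature_names = c("Solar.R", "Temp"),
  train = airQ
)
int_strength
```

If your installed vip no longer provides vint(), this will not run as written; in that case check the package's changelog for the replacement.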

Upvotes: 2
