Find out the most contributing variables/features of an R H2O AutoML model?

Question

I'm currently working with some insurance data to predict which insured-sum class a customer will fall into. To do this I'm using the AutoML function of the H2O package in R. Now that I have my model, I'd like to see which variables/features in my data contribute the most to the model's predictions. Is this possible with H2O? If not, what would be another good way to do this in R? Thanks!

Sam Abbott · Accepted Answer

Definitely possible. If the best-fitting model that AutoML has selected is not an ensemble, you can plot the variable importances with the following (where model is your model extracted from AutoML):

library(h2o)
h2o.varimp_plot(model)
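In case it helps, here is a minimal sketch of how model might be extracted from an AutoML run. The iris data, the Species response, and the object names (train_h2o, aml, best_id) are placeholders I've assumed for illustration, not anything from the question; swap in your own frame and response column.

```r
library(h2o)
h2o.init()

# Placeholder data: iris stands in for the insurance data (assumption).
train_h2o <- as.h2o(iris)

# Small AutoML run; max_models is kept low so the sketch finishes quickly.
aml <- h2o.automl(y = "Species", training_frame = train_h2o,
                  max_models = 5, seed = 1)

# The overall best model (this may well be a stacked ensemble).
leader <- aml@leader

# Pick the best-ranked model that is not a stacked ensemble.
model_ids <- as.data.frame(aml@leaderboard)$model_id
best_id <- model_ids[!grepl("StackedEnsemble", model_ids)][1]
model <- h2o.getModel(best_id)

# Variable importances as a table, then as a plot.
print(h2o.varimp(model))
h2o.varimp_plot(model)
```

If you only care about variable importances, another route is to pass exclude_algos = c("StackedEnsemble") to h2o.automl so that the leader is never an ensemble, at the cost of possibly giving up some accuracy.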

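If the best-fitting model is an ensemble, things are a little more complicated, because h2o.varimp_plot does not work directly on a stacked ensemble. A good option is to use the lime package to look at local importance, i.e. which features drive the prediction for individual cases: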

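 library(h2o)
 library(lime)

 ## Train explainer
 ## (train is assumed to be a plain data.frame of the predictors,
 ##  e.g. from as.data.frame(), not an H2OFrame)
 explainer <- lime(train, model)

 ## Get explanations for a subset of samples
 ## (for a classification model, explain() needs labels or n_labels;
 ##  n_labels = 1 explains the most likely class for each case)
 explanation <- explain(train[1:5, ], explainer, n_labels = 1, n_features = 10)

 ## Plot an overview (heatmap) across all explained cases
 plot_explanations(explanation)

 ## Plot local explanations, one panel per case
 plot_features(explanation)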
