Reputation: 43
I'm working with insurance data to predict which insured-sum class a customer will fall into. To do this I'm using the AutoML function of the H2O package in R. Now that I have a model, I'd like to see which variables/features in my data contribute most to its predictions. Is this possible with H2O? If not, what would be another good option in R? Thanks!
Upvotes: 3
Views: 1143
Reputation: 466
Definitely possible. If the best-fitting model that AutoML selected is not an ensemble, you can plot the variable importances as follows (where model is your model extracted from AutoML):
library(h2o)
h2o.varimp_plot(model)
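In case it helps, here is a minimal sketch of how model could be pulled out of an AutoML run, skipping any stacked ensemble at the top of the leaderboard. The iris data, the max_models and seed values, and the excluded algorithms are assumptions chosen only to keep the example small and limited to tree-based models; substitute your own frame and response column.

```r
library(h2o)

h2o.init()

# Toy data standing in for the insurance data (assumption for illustration).
iris_hf <- as.h2o(iris)

# Small AutoML run. max_models, seed and exclude_algos are arbitrary choices
# that keep the run short and restricted to tree-based base models.
aml <- h2o.automl(y = "Species", training_frame = iris_hf,
                  max_models = 5, seed = 1,
                  exclude_algos = c("GLM", "DeepLearning"))

# The leaderboard is sorted best-first; the leader may be a stacked ensemble.
lb <- as.data.frame(aml@leaderboard)
model_ids <- as.character(lb$model_id)

# Take the best-ranked model whose ID does not mark it as a stacked ensemble.
non_ensemble_ids <- model_ids[!grepl("StackedEnsemble", model_ids)]
model <- h2o.getModel(non_ensemble_ids[1])

# Variable importances as a table, then as a plot.
vi <- h2o.varimp(model)
print(vi)
h2o.varimp_plot(model)
```

If the leader itself is not an ensemble, aml@leader gives the same kind of model object directly. Bear in mind that the importances then describe the chosen base model, not the ensemble.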
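If the best-fitting model is an ensemble, things are a little more complicated, because H2O's stacked ensembles do not expose variable importances directly. A good option is to use the lime package to look at local importance, i.e. which features drive individual predictions.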
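library(h2o)
library(lime)
## lime expects a data.frame; convert in case train is an H2OFrame.
## Ideally keep only the predictor columns here, not the response.
train_df <- as.data.frame(train)
## Train explainer
explainer <- lime(train_df, model)
## Get explanations for a subset of samples
## (classification models need labels or n_labels)
explanation <- explain(train_df[1:5, ], explainer, n_labels = 1, n_features = 10)
## Plot an overview of the explanations across the selected samples
plot_explanations(explanation)
## Plot local (per-sample) explanations
plot_features(explanation)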
Upvotes: 4