Blue Otter Hat

Reputation: 617

Extract feature importance from an mlflow 1.9 PyFuncModel model

Top line: How can I extract feature importance from an xgboost model that has been saved in mlflow as a PyFuncModel?

Details:

import mlflow
import shap

model = mlflow.pyfunc.load_model(model_load_details)  
print(f"model {type(model)}")
# model <class 'mlflow.pyfunc.PyFuncModel'>

explainer = shap.Explainer(model)

... which raises the error "Exception: The passed model is not callable and cannot be analyzed directly with the given masker! Model: mlflow.pyfunc.loaded_model:"

My own thinking: extract the parameter settings for the best model from mlflow, use them to retrain a fresh xgboost model, and then save it in the xgboost flavor with mlflow.xgboost.save_model(). But is there a better way?

Upvotes: 0

Views: 2477

Answers (1)

Kenshin

Reputation: 19

You can get feature importance like this:

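import mlflow
import pandas as pd

# Set the mlflow configuration
mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)
mlflow.set_experiment(EXPERIMENT_NAME)

# Load the model from the mlflow model registry
loaded_model = mlflow.pyfunc.load_model(f"models:/{MODEL_NAME_MLFLOW}/staging")

# Unwrap the native estimator. This path assumes a custom python_model that
# stores the estimator as .model; _model_impl is a private attribute.
native_model = loaded_model._model_impl.python_model.model

# Get the top 10 features by importance.
# feature_name_ is the LightGBM scikit-learn attribute; an XGBoost
# scikit-learn estimator likely exposes feature_names_in_ instead.
data = {'feature_name': native_model.feature_name_,
        'imp': native_model.feature_importances_}
fi = pd.DataFrame(data).sort_values(by='imp', ascending=False)
top_10_features = fi.head(10)['feature_name'].to_list()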
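
The attribute path above depends on how the model was logged, so it may not match every PyFuncModel. As a rough sketch only, the lookup can be made a little more defensive by trying a few candidate paths and taking the first object that exposes feature_importances_. The helper names find_native_model and top_features, and the list of candidate paths, are hypothetical: they rest on assumptions about private mlflow internals that can differ between versions and flavors. A stand-in object is used here in place of a real loaded model so the snippet runs without a tracking server.

```python
from types import SimpleNamespace

# Candidate attribute paths from the pyfunc wrapper to the native model.
# These are assumptions about private mlflow internals, not a public API.
CANDIDATE_PATHS = (
    "_model_impl.python_model.model",  # custom PythonModel keeping the estimator as .model
    "_model_impl.xgb_model",           # wrapper assumed for the xgboost flavor
    "_model_impl",                     # flavors that may expose the estimator directly
)


def find_native_model(pyfunc_model, paths=CANDIDATE_PATHS):
    """Return the first object along `paths` that has feature_importances_."""
    for path in paths:
        obj = pyfunc_model
        for name in path.split("."):
            obj = getattr(obj, name, None)
            if obj is None:
                break
        if obj is not None and hasattr(obj, "feature_importances_"):
            return obj
    raise AttributeError("no object with feature_importances_ found")


def top_features(names, importances, n=10):
    """Pair names with importances and return the n largest, highest first."""
    ranked = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


# Stand-in for a loaded PyFuncModel, used only to exercise the helpers.
fake_estimator = SimpleNamespace(
    feature_name_=["age", "income", "tenure"],
    feature_importances_=[0.2, 0.5, 0.3],
)
fake_pyfunc = SimpleNamespace(
    _model_impl=SimpleNamespace(python_model=SimpleNamespace(model=fake_estimator))
)

native = find_native_model(fake_pyfunc)
print(top_features(native.feature_name_, native.feature_importances_, n=2))
# [('income', 0.5), ('tenure', 0.3)]
```

With a real model, find_native_model(loaded_model) would replace the stand-in. If the wrapped object is a raw xgboost Booster rather than a scikit-learn style estimator, it will not have feature_importances_ and the helper raises AttributeError, which at least fails loudly instead of returning the wrong object.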

Upvotes: 1
