joseherazo04

Reputation: 35

Convert an instance of xgboost.Booster into a model that implements the scikit-learn API

I am trying to use mlflow to save a model and then load it later to make predictions.

I'm using an xgboost.XGBRegressor model and its sklearn methods .predict() and .predict_proba() to make predictions, but it turns out that mlflow doesn't support models that implement the sklearn API. When I load the model later, mlflow returns an instance of xgboost.Booster, which doesn't offer the sklearn-style .predict() or .predict_proba() methods.

Is there a way to convert an xgboost.Booster back into an xgboost.sklearn.XGBRegressor object that implements the sklearn API methods?

Upvotes: 3

Views: 3077

Answers (2)

Dimitar Nentchev

Reputation: 141

I have an xgboost.core.Booster object, and it can return probability predictions directly, as follows: your_Booster_model_object.predict(your_xgboost_dmatrix_dataset). Note that the input has to be an xgboost.DMatrix rather than a plain array or DataFrame.

Upvotes: 0

Waqas

Reputation: 6802

Have you tried wrapping your model in a custom class, then logging and loading it using mlflow.pyfunc.PythonModel? I put together a simple example, and after loading the model back it correctly shows <class 'xgboost.sklearn.XGBRegressor'> as the type.

Example:

import mlflow
import mlflow.pyfunc
import xgboost as xgb

# Fit xg_reg on your training data before logging it
xg_reg = xgb.XGBRegressor(...)

class CustomModel(mlflow.pyfunc.PythonModel):
    def __init__(self, xgbRegressor):
        self.xgbRegressor = xgbRegressor

    def predict(self, context, input_data):
        # Show that the wrapped object is still the sklearn-style regressor
        print(type(self.xgbRegressor))
        return self.xgbRegressor.predict(input_data)

# Log model to local directory
with mlflow.start_run():
    custom_model = CustomModel(xg_reg)
    mlflow.pyfunc.log_model("custome_model", python_model=custom_model)


# Load model back
from mlflow.pyfunc import load_model
model = load_model("/mlruns/0/../artifacts/custome_model")
model.predict(X_test)

Output:

<class 'xgboost.sklearn.XGBRegressor'>
[ 9.107417 ]

Upvotes: 4
