Nourless

Reputation: 854

How to load model from mlflow with custom predict without local file?

I want to log a model with a custom predict method. Here is an example of its signature:

from sklearn.ensemble import RandomForestRegressor
class CustomForest(RandomForestRegressor):
    def predict(self, X, second_arg=False):
        pred = super().predict(X)
        value = 1 if second_arg else 0
        return pred, value

This model is saved in the file model.py. From here I got the idea to create a wrapper to get access to the model:

import mlflow.pyfunc


class WrapperPythonModel(mlflow.pyfunc.PythonModel):
    """
    Class to train and use custom model
    """

    def load_context(self, context):
        """This method is called when loading an MLflow model with pyfunc.load_model(), as soon as the Python Model is constructed.
        Args:
            context: MLflow context where the model artifact is stored.
        """
        import joblib

        self.model = joblib.load(context.artifacts["model_path"])

    def predict(self, context, model_input):
        """This is an abstract function. We customized it into a method to fetch the model.
        Args:
            context ([type]): MLflow context where the model artifact is stored.
            model_input ([type]): the input data to fit into the model.
        Returns:
            [type]: the loaded model artifact.
        """
        return self.model

And here is how I save and log it:

import joblib
import mlflow

model = CustomForest()
model.fit(X, y)

# Serialize the fitted model so it can be attached as an artifact
model_path = 'model.pkl'
joblib.dump(model, model_path)

artifacts = {"model_path": model_path}

with mlflow.start_run() as run:
    mlflow.pyfunc.log_model(
        artifact_path=model_path,
        registered_model_name='model',
        python_model=WrapperPythonModel(),
        code_path=["models.py"],
        artifacts=artifacts,
    )

But when I load it and deploy it on another machine, I get an error saying the module models.py is not found. How can I fix that? I thought specifying the code_path parameter would fix issues with missing local files.

Upvotes: 4

Views: 2762

Answers (1)

Tiago Santos

Reputation: 11

I was facing the same error: ModuleNotFoundError: No module named 'src.custom_code'.

In my case, I created the custom code inside a folder named src, then included it in the log_model call:

mlflow.pyfunc.log_model(
    "Master",
    python_model=ModelWrapper(model),
    signature=signature,
    code_path=["src/custom_code.py", "src/model_wrapper.py"],
    registered_model_name="Main",
    await_registration_for=10,
    input_example=X_test.sample(5, random_state=RANDOM_SEED),
)

When I passed dst_path while loading, it downloaded all the files to that path:

mlflow.pyfunc.load_model(uri, dst_path="src")

After the download, I moved the files from the src\code folder to src and ran load_model again.
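
The move can also be scripted rather than done by hand. This is only a minimal sketch: it assumes the download left the custom modules in a code subfolder of dst_path, as it did here, and promote_code_files is a hypothetical helper name, not part of MLflow.

```python
import shutil
from pathlib import Path


def promote_code_files(dst_path, code_dir_name="code"):
    """Move every entry from <dst_path>/<code_dir_name> up into <dst_path>.

    Hypothetical helper: assumes the custom modules were downloaded into a
    "code" subfolder. Entries that already exist in dst_path are left alone.
    Returns the list of paths that were moved.
    """
    dst = Path(dst_path)
    code_dir = dst / code_dir_name
    moved = []
    if not code_dir.is_dir():
        return moved  # nothing was downloaded into a code folder
    for entry in sorted(code_dir.iterdir()):
        target = dst / entry.name
        if target.exists():
            continue  # do not overwrite what is already there
        shutil.move(str(entry), str(target))
        moved.append(target)
    return moved
```

With the layout above, this would be called as promote_code_files("src") after the first load_model attempt and before the second one.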

In your case I think the models.py file is in the root, so maybe try loading with mlflow.pyfunc.load_model(uri, dst_path="") and then moving the file from the code folder to the root.
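
Before retrying load_model, it may help to confirm that Python can actually find the module on its import path. The following is a rough sketch using only the standard library; missing_modules is a hypothetical helper name, and the module names passed to it are whatever your pickled model refers to (for example "models" or "src.custom_code").

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be located for import."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # For dotted names find_spec imports the parent package first;
            # a missing parent (e.g. "src") raises ModuleNotFoundError.
            spec = None
        if spec is None:
            missing.append(name)
    return missing
```

An empty result from missing_modules(["models"]) suggests the file ended up somewhere importable; a non-empty result means the file still needs to be moved or its folder added to the import path.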

Upvotes: 1
