I want to log a model with a custom predict method. Here is an example of its signature:
from sklearn.ensemble import RandomForestRegressor

class CustomForest(RandomForestRegressor):
    def predict(self, X, second_arg=False):
        pred = super().predict(X)
        value = 1 if second_arg else 0
        return pred, value
This model is saved in the file models.py. From here I got the idea to create a wrapper to get access to the model:
import mlflow.pyfunc

class WrapperPythonModel(mlflow.pyfunc.PythonModel):
    """
    Class to train and use custom model
    """

    def load_context(self, context):
        """This method is called when loading an MLflow model with pyfunc.load_model(), as soon as the Python Model is constructed.

        Args:
            context: MLflow context where the model artifact is stored.
        """
        import joblib
        self.model = joblib.load(context.artifacts["model_path"])

    def predict(self, context, model_input):
        """This is an abstract function. We customized it into a method to fetch the model.

        Args:
            context ([type]): MLflow context where the model artifact is stored.
            model_input ([type]): the input data to fit into the model.

        Returns:
            [type]: the loaded model artifact.
        """
        return self.model
And here is how I save and log it:
import joblib
import mlflow

model = CustomForest()
model.fit(X, y)
model_path = 'model.pkl'
joblib.dump(model, model_path)  # serialize the fitted model to disk
artifacts = {"model_path": model_path}
with mlflow.start_run() as run:
    mlflow.pyfunc.log_model(
        artifact_path=model_path,
        registered_model_name='model',  # name under which the model is registered
        python_model=WrapperPythonModel(),
        code_path=["models.py"],  # local file to bundle with the logged model
        artifacts=artifacts,
    )
But when I load it and deploy it on another machine, I get an error saying that the module models.py is not found. How can I fix that? I thought that specifying the code_path parameter would fix such issues with missing local files.
Upvotes: 4
Views: 2762
I was facing the same error: ModuleNotFoundError: No module named 'src.custom_code'
In my case, I had created the custom code inside a folder named src, then included it in the log_model call:
mlflow.pyfunc.log_model(
    "Master",
    python_model=ModelWrapper(model),
    signature=signature,
    code_path=["src/custom_code.py", "src/model_wrapper.py"],
    registered_model_name="Main",
    await_registration_for=10,
    input_example=X_test.sample(5, random_state=RANDOM_SEED),
)
When I passed dst_path while loading, it downloaded all the files to that path:
mlflow.pyfunc.load_model(uri, dst_path="src")
After the download, I moved the files from the src\code folder to src and ran load_model again.
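The manual move described above can also be scripted. Here is a minimal sketch using only the standard library. The helper name promote_code_files is hypothetical, and it assumes the custom modules ended up in a code subfolder of dst_path, as described above. It skips anything that already exists at the destination rather than overwriting it.

```python
import shutil
from pathlib import Path


def promote_code_files(dst_path, code_dirname="code"):
    """Move every entry from <dst_path>/<code_dirname> up into <dst_path>.

    Hypothetical helper: the "code" folder name is an assumption taken
    from the layout described above. Returns the names that were moved.
    """
    dst = Path(dst_path)
    code_dir = dst / code_dirname
    if not code_dir.is_dir():
        return []  # no code folder was downloaded, nothing to do
    moved = []
    for entry in sorted(code_dir.iterdir()):
        target = dst / entry.name
        if target.exists():
            continue  # keep the existing file instead of overwriting it
        shutil.move(str(entry), str(target))
        moved.append(entry.name)
    return moved


# Possible usage, once the model files have been downloaded into "src":
#   promote_code_files("src")
```

After the helper has run, load_model can be called again as before.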
In your case, I think the models.py file is in the root, so maybe try loading with mlflow.pyfunc.load_model(uri, dst_path="") and then moving the file from the code folder to the root.
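If moving files by hand is inconvenient, a possible alternative is to make the downloaded code folder importable by adding it to sys.path before loading the model. This is only a sketch of the general Python technique: the helper name add_code_dir_to_path is hypothetical, the code folder name is assumed from the layout above, and I have not verified that it resolves the MLflow error in every setup.

```python
import importlib
import sys
from pathlib import Path


def add_code_dir_to_path(dst_path, code_dirname="code"):
    """Put <dst_path>/<code_dirname> at the front of sys.path.

    Hypothetical helper. Returns the absolute path that is now on
    sys.path, or None if the folder does not exist.
    """
    code_dir = (Path(dst_path) / code_dirname).resolve()
    if not code_dir.is_dir():
        return None
    code_dir_str = str(code_dir)
    if code_dir_str not in sys.path:
        sys.path.insert(0, code_dir_str)
    # Let the import system notice files created after startup.
    importlib.invalidate_caches()
    return code_dir_str


# Possible usage, before unpickling a model that needs `import models`:
#   add_code_dir_to_path(".")
```

The idea is that unpickling the model needs the module that defines its class to be importable, so exposing the folder that contains models.py may be enough without moving anything.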
Upvotes: 1