Reputation: 31
I am trying to build a Docker image that hosts an endpoint for my model, and I am having trouble making my code available during the build so that the image can later run and serve the model.
The model is a scikit-learn classifier, but it is wrapped in my modified version of the sklearn Pipeline class, let's call it BypassablePipeline. The model trains fine, I am logging it to the MLflow tracking server, and I am also able to pull it back and execute it locally. This is how I log the model:
model_info = mlflow.sklearn.log_model(
    model,
    artifact_path='model',
    registered_model_name=model_registry_name,
    signature=model_signature,
    pip_requirements=my_pip_requirements,
)
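For completeness, the pull-and-execute round trip that works for me locally looks roughly like this (X_sample is a placeholder sample input):

import mlflow

# Load the model back from the tracking server using the URI returned by log_model
loaded = mlflow.sklearn.load_model(model_info.model_uri)
loaded.predict(X_sample)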
I have BypassablePipeline in mymodule, installed in development mode. Here is an example of my requirements.txt file, omitting other packages for simplicity's sake:
scikit_learn==1.2.0
mlflow==2.1.1
-e .
. is where I locally have the setup.py for mymodule, and it installs fine. When I execute pip list, this is what I get:
scikit_learn 1.2.0
mlflow 2.1.1
mymodule 0.0.1 /home/my_user/my_repo/
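For context, the setup.py behind the -e . line is nothing exotic; a minimal sketch (the real metadata differs):

from setuptools import find_packages, setup

# Minimal packaging stub for mymodule; name/version are illustrative
setup(
    name='mymodule',
    version='0.0.1',
    packages=find_packages(),
)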
Everything works locally. These requirements are passed as my_pip_requirements to the log_model call above. Also, I am creating the local environment using pyenv. Next, I activate the environment and run:
mlflow models build-docker -m "path_to_model" --enable-mlserver -n "image_name" --env-manager=virtualenv
The model server image builds fine. When I later test whether the server stands up using:
docker run "image_name"
This is where the problems start, and I am getting an error, namely:
ModuleNotFoundError: No module named 'my_module'
Clearly the module, although present in my env, is not properly transferred at build time. So far I have tried the following:

- Building a wheel out of mymodule, then replacing the -e . line with the path to the wheel (as sketched after this list). I relogged the model to have the proper my_pip_requirements, and the error is still there.
- Using --env-manager=local instead of virtualenv, but then it also omits much more (for instance, the Python version), which becomes a warning, and the error is still there.
- Wrapping the model in a PythonModel per the examples in the mlflow repo:

from typing import Any, Dict, Optional
import mlflow

class PyfuncModelWrapper(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # load the original sklearn-flavor model from the logged artifact
        self.model = mlflow.sklearn.load_model(context.artifacts['original_model_path'])

    def predict(self, context, model_input, params: Optional[Dict[str, Any]] = None):
        return self.model.predict(model_input)
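For reference, the wheel-based requirements file from the first attempt looked roughly like this (wheel built with python -m build; the filename is illustrative):

scikit_learn==1.2.0
mlflow==2.1.1
./dist/mymodule-0.0.1-py3-none-any.whl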
Can you point me in the right direction here? What has worked for you? Is there something obvious that I am missing?
Upvotes: 3
Views: 822
Reputation: 21072
Converting the OP's solution, originally posted as an edit to the question, into an answer:
I did not identify why the model cannot find the module when I build a package out of it and install it in the env that I build the Docker image from, using the commands above. However, I made the model work using PyfuncModelWrapper(mlflow.pyfunc.PythonModel). I had to wrap the model in this wrapper following the mlflow docs (it needs the load_context and predict methods). Then mlflow.pyfunc.log_model has a code_path argument where I could pass a list of modules, or folders containing the modules (namely a package). When the model is logged, it copies the code as an artifact, which I think is a less elegant solution, but it works. The code is carried over into the Docker image with the served model, and when starting it I no longer get the error.

I am registering both models, the original sklearn flavor and the wrapped one; note that model_info is from the snippet earlier.

mlflow.pyfunc.log_model(
    artifact_path='pyfunc_model',
    code_path=['./my_package/'],
    artifacts={'original_model_path': model_info.model_uri},
    python_model=PyfuncModelWrapper(),
    signature=model_signature,
    pip_requirements=my_pip_requirements,
    await_registration_for=100,
    registered_model_name=model_registry_name,
)
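To sanity-check, the wrapped model can be pulled back like any pyfunc model; a quick sketch, assuming the return value of the log_model call above is captured as pyfunc_info and sample_df is a placeholder input:

import mlflow

# code_path ships my_package inside the model artifact, so the import
# resolves even where mymodule is not pip-installed
loaded = mlflow.pyfunc.load_model(pyfunc_info.model_uri)
predictions = loaded.predict(sample_df)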
Upvotes: 0