Michal M

Reputation: 31

Proper way of making my python module available to the mlflow during mlflow models build-docker

I am trying to build a Docker image that would host my model behind an endpoint, but I am running into issues making my code available during the build so that the image can later run and serve the model.

The model is an sklearn-style classifier, but it is wrapped in my modified version of the sklearn.Pipeline class; let's call it BypassablePipeline. The model trains fine, I can log it to the MLflow tracking server, and I am also able to pull it back and execute it locally. This is how I log the model:

    model_info = mlflow.sklearn.log_model(model,
                                          artifact_path='model',
                                          registered_model_name=model_registry_name,
                                          signature=model_signature,
                                          pip_requirements=my_pip_requirements)

I have the BypassablePipeline in mymodule, which is installed in development mode. Here is an example of my requirements.txt, omitting other packages for simplicity's sake:

scikit_learn==1.2.0
mlflow==2.1.1
-e .

`.` is the directory where I locally have the setup.py for mymodule, and it installs fine. When I execute pip list, this is what I get:
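For context, the `-e .` line resolves against a setup.py in the repo root. A minimal sketch of what such a file might look like (the package name and version match the pip list output below; everything else is an assumption about the layout):

```python
# setup.py — minimal sketch for an editable-installable package
from setuptools import find_packages, setup

setup(
    name="mymodule",
    version="0.0.1",
    packages=find_packages(),  # picks up the mymodule package directory
)
```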

scikit_learn 1.2.0
mlflow       2.1.1
mymodule     0.0.1 /home/my_user/my_repo/

plus it is working locally.

These requirements are passed as my_pip_requirements to the log_model call above. I am creating the local environment using pyenv. Next, I activate the environment and run:

    mlflow models build-docker -m "path_to_model" --enable-mlserver -n "image_name" --env-manager=virtualenv

The model server build runs fine. When I later try to test whether the server stands up using:

    docker run "image_name"

this is where the problems start, and I get an error:

    ModuleNotFoundError: No module named 'my_module'

Clearly the module, although present in my environment, is not properly transferred at build time. So far I have tried the following:

  1. Prior to the Docker build with the above command, I created a fresh environment and built a wheel package out of mymodule, then replaced the `-e .` line with the path to the wheel. I relogged the model so that my_pip_requirements would be correct, but the error is still there.
  2. Tried `--env-manager=local` instead of virtualenv, but then it also omits much more (for instance the Python version), which shows up as a warning, and the error is still there.
  3. Wrapped the model with PythonModel per the examples in the mlflow repo:
from typing import Any, Dict, Optional

import mlflow


class PyfuncModelWrapper(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        self.model = mlflow.sklearn.load_model(context.artifacts['original_model_path'])

    def predict(self, context, model_input, params: Optional[Dict[str, Any]] = None):
        return self.model.predict(model_input)
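For what it's worth, the wrapper in step 3 does nothing more than delegate predict to the inner model that load_context loads. Stripped of MLflow, the delegation pattern looks like this (DummyModel and the dict-based context are illustrative stand-ins, not MLflow APIs):

```python
from typing import Any, Dict, Optional


class DummyModel:
    """Stand-in for the sklearn pipeline; purely for illustration."""

    def predict(self, model_input):
        return [x * 2 for x in model_input]


class DelegatingWrapper:
    """Mirrors the load_context/predict shape of mlflow.pyfunc.PythonModel."""

    def load_context(self, context):
        # In MLflow this would resolve context.artifacts['original_model_path'];
        # here the "context" is just a plain dict.
        self.model = context["original_model"]

    def predict(self, context, model_input, params: Optional[Dict[str, Any]] = None):
        return self.model.predict(model_input)


wrapper = DelegatingWrapper()
wrapper.load_context({"original_model": DummyModel()})
print(wrapper.predict(None, [1, 2, 3]))  # → [2, 4, 6]
```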

Can you point me in the right direction here? What has worked for you? Is there something obvious that I am missing?

Upvotes: 3

Views: 822

Answers (1)

TylerH

Reputation: 21072

Converting solution response by OP from an edit to the question to an answer:

I did not identify why the model cannot find the module even when I build a package out of it and install it in the environment from which I build the Docker image, using the commands above. However, I got the model working using PyfuncModelWrapper(mlflow.pyfunc.PythonModel). I had to wrap the model following the mlflow docs (it needs load_context and predict methods). mlflow.pyfunc.log_model then has a code_path argument where I could pass a list of modules, or folders containing the modules (i.e. a package). When the model is logged, it copies the code as an artifact, which I think is a less elegant solution, but it works. The code is carried over into the Docker image with the served model, and when starting it I no longer get the error.

I register both models, the original sklearn flavor and the wrapped one; note that model_info comes from the snippet earlier.

mlflow.pyfunc.log_model(
    artifact_path='pyfunc_model',
    code_path=['./my_package/'],
    artifacts={'original_model_path': model_info.model_uri},
    python_model=PyfuncModelWrapper(),
    signature=model_signature,
    pip_requirements=my_pip_requirements,
    await_registration_for=100,
    registered_model_name=model_registry_name,
)

Upvotes: 0
