Tyler Cardon

Reputation: 51

How to manage Python package versions installed in docker images and in running containers

I want to control and monitor the versions of custom python packages installed on containers. This is for controlled, incremental releases of new custom python packages into low risk or test containers prior to deploying them to more high risk or production containers.

For example, I might want to update a fraction of the containers with the new Python packages iteratively, until all of them are updated.

I don't know of a utility that does this for me, but I have an idea that might explain some of what I want to accomplish:

The best idea I can come up with is to make the Python package versions arguments to the image build, pass those versions to pip in the Dockerfile, and record them in a file (e.g. MYPACKAGES.json) on the image as the last step of the Dockerfile. That lets me run a command (cat MYPACKAGES.json) across all of my containers to print the installed package versions from the running containers and choose which ones need to be upgraded.
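
For illustration, a minimal sketch of that approach (the package name mypkg, the build argument MYPKG_VERSION, and the file location are placeholders I made up, not part of any real setup):

FROM python:3.9
# version supplied at build time, e.g. docker build --build-arg MYPKG_VERSION=1.2.3 .
ARG MYPKG_VERSION=1.0.0
RUN pip install "mypkg==${MYPKG_VERSION}"
# record the installed version so it can be read back from a running container
RUN printf '{"mypkg": "%s"}' "$MYPKG_VERSION" > /MYPACKAGES.json

A running container built from this image could then be inspected with docker exec <container> cat /MYPACKAGES.json.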

Is there a cleaner way to do this with an existing utility?

Upvotes: 1

Views: 3082

Answers (1)

David Maze

Reputation: 158908

The standard Python packaging tools already essentially support the workflow you're describing. If you're using a "classic" setup.py, that file lists the packages your application directly depends on, with acceptable version ranges; you can then run pip freeze to create a requirements.txt file that pins the exact version of every package you directly or indirectly use. Similarly, if you're using Pipenv, its Pipfile and Pipfile.lock carry the version ranges and the exact versions, respectively.
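
As a sketch (the project name, dependency, and entry point here are made-up examples, not anything from your question), a minimal setup.py with version ranges might look like:

from setuptools import setup, find_packages

setup(
    name="my_app",                # hypothetical project name
    version="1.0.0",
    packages=find_packages(),
    install_requires=[
        # acceptable ranges, not exact pins
        "requests>=2.25,<3",
    ],
    # provides a my_app console script, matching the CMD in the Dockerfile below
    entry_points={"console_scripts": ["my_app = my_app.main:main"]},
)

Running pip freeze in an environment where this has been installed then records the exact resolved version of each direct and indirect dependency.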

When you build your Docker image, install from the lock file (requirements.txt or Pipfile.lock) so the image gets exactly those versions.

FROM python:3.9
WORKDIR /app
# requirements.txt pins exact versions (e.g. generated by pip freeze)
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
# also install the application itself into the "system" Python
RUN pip install .
CMD ["my_app"]

This implies using a non-Docker virtual environment for day-to-day development, unit testing, and pre-deployment testing. This is probably a good idea in any case, since Docker's isolation mechanisms can add some significant challenges if you're trying to use tools that only live inside a container.
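
A typical local workflow might look like this (the directory and file names are just placeholders):

python -m venv venv               # non-Docker virtual environment for development
. venv/bin/activate
pip install -e .                  # your application plus its direct dependencies
pip freeze > requirements.txt     # exact versions to feed into the Docker build

(If pip freeze includes an entry for your own project, remove it from the generated file; the Dockerfile installs the application separately.)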

"I might want to update a fraction of the containers with the new Python packages iteratively, until all of them are updated."

If you otherwise have good Kubernetes practices, this is possible.

In the very simplest version of this, you'll detect some compatibility problem at startup time, and new pods will just crash if the library stack doesn't work. In that case you don't need to do much. Make sure every build of your application has a distinct Docker image tag, and change the image: in your Deployment spec. Kubernetes on its own will start deploying new Pods with the new image, and will only tear down old Pods when the new ones successfully pass their health checks.
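
A sketch of that step, assuming the Deployment and its container are both named my-app (the names and the image reference are placeholders):

# point the Deployment at the newly built, uniquely tagged image
kubectl set image deployment/my-app my-app=registry.example.com/my-app:20240101
# watch the rolling update; it only completes if the new Pods become healthy
kubectl rollout status deployment/my-app

You could equally edit the image: field in the Deployment YAML and kubectl apply it; the rolling-update behavior is the same.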

You could do something similar with two separate Deployments if you wanted more control over the process. You could label them, for example, "blue" and "green", make the same Service select both sets of Pods, and use kubectl scale deployment to change the number of replicas of each until you're fully switched over.
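
For example, with two Deployments my-app-blue (running the current image) and my-app-green (running the new one), both selected by the same Service via a shared label (all names here are placeholders):

# shift capacity incrementally from the old Deployment to the new one
kubectl scale deployment my-app-green --replicas=2
kubectl scale deployment my-app-blue --replicas=8
# ...observe, then continue until the old set is drained
kubectl scale deployment my-app-green --replicas=10
kubectl scale deployment my-app-blue --replicas=0

Since the Service selects Pods from both Deployments, the traffic split roughly follows the replica counts.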

If you do build exact library dependencies into your Docker images, and you do give each image a distinct tag, then a still better path is to set up an integration-test environment. Run a complete copy of the application there, including your proposed library updates, and make sure you have good enough tests to know whether it works or not. If it runs successfully in this environment (especially with the library updates), it's very likely that the identical images will also run successfully in production, so you're probably okay to jump straight to the "update the Deployment" step without the more complex blue/green deployment setup.

Upvotes: 3
