Reputation: 727
I have been building some Python Docker images recently. Best practice is obviously not to run containers as the root user, and to remove sudo privileges from the non-privileged user.
But I have been wondering what's the best way to go about this.
Here is an example Dockerfile
FROM python:3.10
## get UID/GID of host user for remapping to access bindmounts on host
ARG UID
ARG GID
## add a user with same GID and UID as the host user that owns the workspace files on the host (bind mount)
RUN adduser --uid ${UID} --gid ${GID} --no-create-home flaskuser
RUN usermod -aG sudo flaskuser
## install packages as root?
RUN apt-get update \
&& apt-get upgrade -y \
&& apt-get install -y --no-install-recommends python3-pip \
#&& [... install some packages ...]
&& apt-get install -y uwsgi-plugin-python3 \
## cleanup
&& apt-get clean \
&& apt-get autoclean \
&& apt-get autoremove --purge -y \
&& rm -rf /var/lib/apt/lists/*
## change to workspace folder and copy requirements.txt
WORKDIR /workspace/web
COPY ./requirements.txt /tmp/requirements.txt
RUN chown flaskuser:users /tmp/requirements.txt
## Install python packages as root?
RUN python3 -m pip install --disable-pip-version-check --no-cache-dir -r /tmp/requirements.txt
RUN chmod -R 777 /usr/local/lib/python3.10/site-packages/*
ENV PYTHONUNBUFFERED=1
ENV PYTHONPATH="${PYTHONPATH}:/workspace/web"
ENV PYTHONPATH="${PYTHONPATH}:/usr/local/lib/python3.10/site-packages"
## change to non-privileged user to run container
USER flaskuser
CMD ["uwsgi", "uwsgi.ini"]
So my questions are:
Is installing packages with apt-get as root ok, or should these be installed as the non-privileged user (with sudo, which would later be removed)?
Best location to install these packages: /usr/local/ (the default when installing as root), or would it be preferable to install into the user's home directory?
When installing Python packages with pip as root, I get the following warning:
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
(However, I don't need a venv since the Docker image is already isolated for a single service, so I guess I can just ignore that warning.)
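As an aside, newer pip releases (22.1 and later) can also silence that specific warning explicitly, if ignoring it is the intent:
RUN python3 -m pip install --root-user-action=ignore --disable-pip-version-check --no-cache-dir -r /tmp/requirements.txt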
Anything else I am missing or should be aware of?
NB: the bind mounted workspace is only for development, for a production image I would copy the necessary files/artifacts into the image/container.
Thanks
Upvotes: 22
Views: 27840
Reputation: 727
Thanks again for your extensive explanations, David.
I had to digest all of that, and after some more reading on the topic I finally grasped everything you said (or so I hope).
The reason I first added the user with a UID/GID matching the host user was that, when I started, I ran my containers on my NAS, which only allows SSH as root. Running the container as root while the project folder was owned by another user resulted in permission issues when the container user tried to access the bind-mounted files. Back then I did not quite understand all of that, so I carried along the false assumption that the container user must always match the host user's UID.
So I have changed my Dockerfile to use an arbitrary user like you suggested, removed all the unnecessary chown/chmod calls, and I can run this successfully on my local MacBook and on a VPS I am currently testing out.
## ################################################################
## WEB Builder Stage
## ################################################################
FROM python:3.10-slim-buster AS builder
## ----------------------------------------------------------------
## Install Packages
## ----------------------------------------------------------------
RUN apt-get update \
&& apt-get install -y libmariadb3 libmariadb-dev \
&& apt-get install -y gcc \
## cleanup
&& apt-get clean \
&& apt-get autoclean \
&& apt-get autoremove --purge -y \
&& rm -rf /var/lib/apt/lists/*
## ----------------------------------------------------------------
## Add venv
## ----------------------------------------------------------------
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
## ----------------------------------------------------------------
## Install python packages
## ----------------------------------------------------------------
COPY ./requirements.txt /tmp/requirements.txt
RUN python3 -m pip install --upgrade pip \
&& python3 -m pip install wheel \
&& python3 -m pip install --disable-pip-version-check --no-cache-dir -r /tmp/requirements.txt
## ################################################################
## Final Stage
## ################################################################
FROM python:3.10-slim-buster
## ----------------------------------------------------------------
## add user so we can run things as non-root
## ----------------------------------------------------------------
RUN adduser --disabled-password --gecos "" flaskuser
## ----------------------------------------------------------------
## Copy from builder and set ENV for venv
## ----------------------------------------------------------------
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
## ----------------------------------------------------------------
## Set Python ENV
## ----------------------------------------------------------------
ENV PYTHONUNBUFFERED=1 \
    PYTHONPATH="${PYTHONPATH}:/workspace/web/app:/opt/venv/bin:/opt/venv/lib/python3.10/site-packages"
## ----------------------------------------------------------------
## Copy app files into container
## ----------------------------------------------------------------
WORKDIR /workspace/web
COPY . .
## ----------------------------------------------------------------
## Switch to non-privileged user and run app
## the entrypoint script runs either uwsgi or the flask dev server
## depending on FLASK_ENV
## ----------------------------------------------------------------
USER flaskuser
CMD ["/workspace/web/docker-entrypoint.sh"]
If I want to run the container on my NAS (from the NAS host CLI with root) using bind mounts, I can still do so by using a docker-compose.override.yml that will contain
myservice:
  user: "{UID}:{GID}"
where "{UID}:{GID}" are matching my host user who owns the bind mounted folder.
But I am also going to change this. I am developing and testing only locally now, and might use my NAS as a sort of first integration environment where I will just test the fully built containers/images pulled from a registry (so no need for bind mounts anymore).
I also started to use multi-stage builds, which, besides making the final images way smaller, should hopefully also decrease the attack surface by not including unnecessary build dependencies.
Upvotes: -1
Reputation: 159351
In general, the easiest safe approach is to do everything in your Dockerfile as the root user until the very end, at which point you can declare an alternate USER that gets used when you run the container.
FROM ???
# Debian adduser(8); this does not have a specific known uid
RUN adduser --system --no-create-home nonroot
# ... do the various install and setup steps as root ...
# Specify metadata for when you run the container
USER nonroot
EXPOSE 12345
CMD ["my_application"]
For your more specific questions:
Is installing packages with apt-get as root ok?
It's required; apt-get won't run as non-root. If you have a base image that switches to a non-root user, you need to switch back with USER root before you can run apt-get commands.
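If you do inherit such a base image, the switch-back pattern looks like this ("somebase" and "somepackage" are placeholder names):
## a hypothetical base image whose last USER is non-root
FROM somebase
## regain privileges for package installation
USER root
RUN apt-get update \
 && apt-get install -y --no-install-recommends somepackage \
 && rm -rf /var/lib/apt/lists/*
## drop back to the unprivileged user
USER nonroot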
Best location to install these packages?
The normal system location. If you're using apt-get to install things, it will put them in /usr and that's fine; pip install will want to install things into the system Python site-packages directory; and so on. If you're installing things by hand, /usr/local is a good place for them, particularly since /usr/local/bin is usually in $PATH. The "user home directory" isn't a well-defined concept in Docker and I wouldn't try to use it.
When installing Python packages with pip as root, I get the following warning...
You can in fact ignore it, with the justification you state. There are two common paths to using pip in Docker: the one you show, where you pip install things directly into the "normal" Python, and a second path using a multi-stage build to create a fully-populated virtual environment that can then be COPYed into a runtime image without build tools. In both cases you'll still probably want to be root.
Anything else I am missing or should be aware of?
In your Dockerfile:
## get UID/GID of host user for remapping to access bindmounts on host
ARG UID
ARG GID
This is not a best practice, since it means you'll have to rebuild the image whenever someone with a different host uid wants to use it. Create the non-root user with an arbitrary uid, independent from any specific host user.
RUN usermod -aG sudo flaskuser
If your "non-root" user has unrestricted sudo
access, they are effectively root. sudo
has some significant issues in Docker and is never necessary, since every path to run a command also has a way to specify the user to run it as.
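For example (image and container names are placeholders):
docker run --user nonroot myimage      ## pick the user at container start
docker exec -u root -it mycontainer sh ## get a root shell in a running container for debugging
Compose similarly has a per-service user: key, as in the override file shown earlier in this thread.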
RUN chown flaskuser:users /tmp/requirements.txt
Your code and other source files should have the default root:root ownership. By default they will be world-readable but not writeable, and that's fine. You want to prevent your application from overwriting its own source code, intentionally or otherwise.
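You can spot-check this in a built image (the image name is a placeholder):
docker run --rm myimage ls -ln /workspace/web
## expect owner/group 0 0 (root) and modes like -rw-r--r--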
RUN chmod -R 777 /usr/local/lib/python3.10/site-packages/*
chmod 0777 is never a best practice. It gives a place for unprivileged code to write out their malware payloads and execute them. For a typical Docker setup you don't need chmod at all.
The bind mounted workspace is only for development, for a production image I would copy the necessary files/artifacts into the image/container.
If you use a bind mount to overwrite all of the application code with content from the host, then you're not actually running the code from the image, and some or all of the Dockerfile's work will just be lost. This means that, when you go to production without the bind mount, you're running an untested setup.
Since your development environment will almost always be different from your production environment in some way, I'd recommend using a non-Docker Python virtual environment for day-to-day development, having good (pytest) unit tests that can run outside the container, and doing integration testing on the built container before deploying.
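That loop is nothing Docker-specific; roughly (the image tag is a placeholder):
python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
pytest                  ## fast unit tests, no container involved
docker build -t myapp . ## then integration-test the built image before deploying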
Permission issues can also come up if your application is trying to write out files to a host directory. The best approach here is to restructure your application to avoid it, storing the data somewhere else, like a relational database. In this answer I discuss permission setup for a bind-mounted data directory, though that sounds a little different from what you're asking about here.
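One common shape for that data-directory setup, not necessarily what the linked answer does, is an entrypoint that starts as root, fixes ownership of the mounted directory, and then drops privileges with a tool like gosu (here /data and nonroot are placeholders, and gosu must be installed in the image):
#!/bin/sh
set -e
## the container must initially start as root for this chown to succeed
chown -R nonroot /data
## then drop to the unprivileged user for the real process
exec gosu nonroot "$@"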
Upvotes: 31