When pulling a Docker image, what is downloaded, and what is built locally?

Question

I'm not 100% sure whether this post is appropriate for Stack Overflow or whether it belongs somewhere else. If it shouldn't be here, please suggest where it should go.

I am trying to understand how docker images work. The particular reference in this case is the Dockerfile at https://github.com/frappe/frappe_docker/blob/main/images/production/Containerfile

This file contains VOLUME directives, and some of the RUN commands in it modify the contents of the paths in the VOLUME directives.

If I pull this image from docker hub, what happens with the data in the volumes?

Is it somehow contained in the pulled image? Or are the RUN commands only executed when you start the container in docker?

What happens if you bind mount a local directory onto one of the mount points mentioned in the VOLUME directive when you run the container?

It seems, on experimenting, that bind mounting will replace the data with the contents of the local folder (which means the data created by the RUN commands gets lost).

However, if you use regular named volumes when running the container, the data is there. I thought named volumes persist, though, so what happens when you pull a later version of the image, perhaps with different data in the volume path? Does it change the existing volume?
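
One way to look at what a pulled image already contains, without starting a container, is to read its metadata and layer history. A minimal sketch, assuming the image is already present locally; the function names image_volumes and image_layers are made up for illustration, and only the docker subcommands they wrap are real:

```shell
#!/bin/sh
# Hypothetical helper names; they only wrap existing docker CLI commands.

# Print the mount points declared with VOLUME in the image metadata.
image_volumes() {
  docker image inspect --format '{{json .Config.Volumes}}' "$1"
}

# Print one line per layer. RUN steps show up here because they were
# executed at build time and their results are stored in the image.
image_layers() {
  docker history --no-trunc "$1"
}
```

For example, image_volumes alpine and image_layers alpine could be called after docker pull alpine.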

[Later]

The docs at https://docs.docker.com/storage/volumes/#:~:text=If%20you%20start%20a%20container%20which%20creates%20a%20new%20volume,are%20copied%20into%20the%20volume. say:

Populate a volume using a container

If you start a container which creates a new volume, and the container has files or directories in the directory to be mounted such as /app/, the directory’s contents are copied into the volume. The container then mounts and uses the volume, and other containers which use the volume also have access to the pre-populated content.

But this does not seem to work with bind mounts if the directory already exists.
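
The rule quoted above, and the way it differs for bind mounts, can be pictured with a toy model that uses plain directories in place of Docker. This is only a sketch of the behaviour as I understand it, not Docker's implementation: the function names mount_named_volume and mount_bind and the directory layout are made up for illustration, and the model assumes that image content is copied only into a named volume that is still empty.

```shell
#!/bin/sh
# Toy model only: plain directories stand in for the image path,
# a named volume, and a bind-mounted host directory.

# Named volume: copy the image's files in, but only if the volume is empty.
mount_named_volume() {
  image_dir=$1
  volume_dir=$2
  if [ -z "$(ls -A "$volume_dir")" ]; then
    cp -R "$image_dir"/. "$volume_dir"/
  fi
}

# Bind mount: the host directory is used as-is; nothing is copied.
mount_bind() {
  :
}

work=$(mktemp -d)
mkdir "$work/image" "$work/volume" "$work/host"
echo "build" > "$work/image/log"

mount_named_volume "$work/image" "$work/volume"  # first start: volume receives log
echo "start" >> "$work/volume/log"               # the container writes to the volume
echo "rebuild" > "$work/image/log"               # a newer image with different content
mount_named_volume "$work/image" "$work/volume"  # volume is not empty: left alone

mount_bind "$work/image" "$work/host"            # host directory stays empty

cat "$work/volume/log"
ls -A "$work/host"
```

Run as written, the script prints the two lines build and start, and nothing for the host directory: the existing volume keeps its old content and the bind-mounted directory is never populated.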

[Later] I have done some experiments to compare the behaviour of bind mounts and Docker volumes.

I used this Dockerfile:

FROM alpine
RUN mkdir /test \
    && echo 'echo "start `date`" >>/test/log' >/runtest \
    && echo 'echo "/log:"' >>/runtest \
    && echo 'cat /log' >>/runtest \
    && echo 'echo "/test/log:"' >>/runtest \
    && echo 'cat /test/log' >>/runtest \
    && echo 'sleep 99999999' >>/runtest \
    && chmod +x /runtest \
    && echo "build `date`" >/test/log \
    && echo "build `date`" >/log \
    && cat /runtest \
    && cat /log \
    && cat /test/log
CMD /runtest
VOLUME "/test"

This creates /test/log in a RUN command, containing the build date. The CMD (which is run every time the container starts) appends the start date to this file. The directory is then declared a VOLUME. Note that this has to come AFTER the RUN command: as I understand it, VOLUME captures the contents of the directory at that point, and any RUN commands after it affect the original directory but not the volume. See https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#volume

I then ran it with the following docker-compose file:

version: "3"

services:
  tester:
    build: .
    volumes:
      - test:/test

volumes:
  test:

Build and run first time:

tester_1  | /log:
tester_1  | build Mon Feb 20 08:53:29 UTC 2023
tester_1  | /test/log:
tester_1  | build Mon Feb 20 08:53:29 UTC 2023
tester_1  | start Mon Feb 20 08:53:45 UTC 2023

Stop and run again:

tester_1  | /log:
tester_1  | build Mon Feb 20 08:54:28 UTC 2023
tester_1  | /test/log:
tester_1  | build Mon Feb 20 08:53:29 UTC 2023
tester_1  | start Mon Feb 20 08:53:45 UTC 2023
tester_1  | start Mon Feb 20 08:54:33 UTC 2023

Note that /log (not in the volume) has been overwritten, while /test/log (in the volume) has not: it keeps the original build line and gains a new start line.

I replaced the volume in the docker-compose file with a bind mount.

When I built and ran it, I got the complaint: Service "tester" is using volume "/test" from the previous container.

I deleted the Docker volume and tried again:

tester_1  | /log:
tester_1  | build Mon Feb 20 09:00:50 UTC 2023
tester_1  | /test/log:
tester_1  | start Mon Feb 20 09:01:03 UTC 2023

Note the result of the RUN command is not copied to the bind mount.

My conclusion is that, if you (or other users of your Dockerfile) intend to use bind mounts, you must not expect the contents of volumes to be populated when those contents are created within the Dockerfile (except, obviously, by the CMD directive, which runs when the container starts).
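
One possible workaround, which the Dockerfile above does not do, is to keep a pristine copy of the files elsewhere in the image and have the start-up script copy them into the mount point when it is empty. A minimal sketch, assuming a hypothetical /seed directory filled by a RUN step and a hypothetical function name seed_if_empty:

```shell
#!/bin/sh
# Copy the seed directory into the target only when the target is empty,
# so existing data in a volume or bind mount is never overwritten.
seed_if_empty() {
  seed_dir=$1
  target_dir=$2
  mkdir -p "$target_dir"
  if [ -z "$(ls -A "$target_dir")" ]; then
    cp -R "$seed_dir"/. "$target_dir"/
  fi
}

# In a container this would be called from the start-up script, for example:
#   seed_if_empty /seed /test    # /seed is a hypothetical path
#   exec /runtest
```

Because the copy happens at container start and not at build time, it applies equally to an empty bind-mounted directory and to a new named volume.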
