홍한석

Reputation: 481

Docker build context: when is it transferred, and is it cached by the runtime?

https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#understand-build-context

After reading the doc above, I tested this on my local machine. My environment looks like this:

❯ sw_vers
ProductName:    macOS
ProductVersion: 11.6.1
BuildVersion:   20G224

❯ docker version
Client:
 Cloud integration: v1.0.22
 Version:           20.10.13
 API version:       1.41
 Go version:        go1.16.15
 Git commit:        a224086
 Built:             Thu Mar 10 14:08:44 2022
 OS/Arch:           darwin/amd64
 Context:           default
 Experimental:      true

Server: Docker Desktop 4.6.1 (76265)
 Engine:
  Version:          20.10.13
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.15
  Git commit:       906f57f
  Built:            Thu Mar 10 14:06:05 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.5.10
  GitCommit:        2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
 runc:
  Version:          1.0.3
  GitCommit:        v1.0.3-0-gf46b6ba
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

I understood the following statement in the documentation to mean "even if the Dockerfile does not use files in the build context directly, they still affect the resulting image size":

Inadvertently including files that are not necessary for building an image results in a larger build context and larger image size. This can increase the time to build the image, time to pull and push it, and the container runtime size

So the first test builds two images from the same Dockerfile: one with a build context that contains a large file, and one with a context that does not:

❯ cat Dockefile
FROM alpine:3.15.4
RUN ["echo", "a"]

❯ mkdir -p contexts/small contexts/large

❯ touch contexts/small/file

❯ dd if=/dev/urandom of=contexts/large/file bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.404337 s, 259 MB/s

❯ ll **/*/file --block-size=M
-rw-r--r-- 1 hansuk 100M  4 12 14:15 contexts/large/file
-rw-r--r-- 1 hansuk   0M  4 12 14:15 contexts/small/file

❯ tree -a .
.
├── Dockefile
└── contexts
    ├── large
    │   └── file
    └── small
        └── file

3 directories, 3 files

❯ docker build --no-cache -f Dockefile -t small contexts/small
[+] Building 3.0s (6/6) FINISHED
 => [internal] load build definition from Dockefile                                                            0.0s
 => => transferring dockerfile: 78B                                                                            0.0s
 => [internal] load .dockerignore                                                                              0.0s
 => => transferring context: 2B                                                                                0.0s
 => [internal] load metadata for docker.io/library/alpine:3.15.4                                               2.3s
 => CACHED [1/2] FROM docker.io/library/alpine:3.15.4@sha256:4edbd2beb5f78b1014028f4fbb99f3237d9561100b6881aa  0.0s
 => [2/2] RUN ["echo", "a"]                                                                                    0.4s
 => exporting to image                                                                                         0.0s
 => => exporting layers                                                                                        0.0s
 => => writing image sha256:695a2b36e3b6ec745fddb3d499acda249f7235917127cf6d68c93cc70665a1dc                   0.0s
 => => naming to docker.io/library/small                                                                       0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them

❯ docker build --no-cache -f Dockefile -t large contexts/large
[+] Building 1.4s (6/6) FINISHED
 => [internal] load build definition from Dockefile                                                            0.0s
 => => transferring dockerfile: 78B                                                                            0.0s
 => [internal] load .dockerignore                                                                              0.0s
 => => transferring context: 2B                                                                                0.0s
 => [internal] load metadata for docker.io/library/alpine:3.15.4                                               1.0s
 => CACHED [1/2] FROM docker.io/library/alpine:3.15.4@sha256:4edbd2beb5f78b1014028f4fbb99f3237d9561100b6881aa  0.0s
 => [2/2] RUN ["echo", "a"]                                                                                    0.3s
 => exporting to image                                                                                         0.0s
 => => exporting layers                                                                                        0.0s
 => => writing image sha256:7c48ae10200ceceff83c6619e9e74d76b7799032c53fba57fa8adcbe54bb5cce                   0.0s
 => => naming to docker.io/library/large                                                                       0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them

❯ docker image list
REPOSITORY               TAG       IMAGE ID       CREATED          SIZE
large                    latest    7c48ae10200c   7 seconds ago    5.57MB
small                    latest    695a2b36e3b6   15 seconds ago   5.57MB

The resulting images are exactly the same size, and the 'transferring context' size does not differ either. (This message differs from the one in the documentation, "Sending build context to Docker daemon ...", but I guess that is due to the switch to BuildKit.)

Then I changed the Dockerfile to use the file for the second test:

❯ cat Dockefile
FROM alpine:3.15.4
COPY file /file

❯ docker build --no-cache -f Dockefile -t small contexts/small                                                                 
# ... Eliding unnecessary logs
 => CACHED [1/2] FROM docker.io/library/alpine:3.15.4@sha256:4edbd2beb5f78b1014028f4fbb99f3237d9561100b6881aabbf5acce2c4  0.0s
 => [internal] load build context                                                                                         0.0s
 => => transferring context: 25B                                                                                          0.0s
 => [2/2] COPY file /file                                                                                                 0.0s

❯ docker build --no-cache -f Dockefile -t large contexts/large
# ... Eliding unnecessary logs
 => [internal] load build context                                                                                         4.1s
 => => transferring context: 104.88MB                                                                                     4.0s
 => CACHED [1/2] FROM docker.io/library/alpine:3.15.4@sha256:4edbd2beb5f78b1014028f4fbb99f3237d9561100b6881aabbf5acce2c4  0.0s
 => [2/2] COPY file /file

❯ docker image list
REPOSITORY               TAG       IMAGE ID       CREATED             SIZE
large                    latest    426740a3fa3c   7 minutes ago       110MB
small                    latest    33a32d1e5aab   7 minutes ago       5.57MB

The resulting sizes now differ, apparently by the size of the file. But when I built the "large" image again, I noticed that nothing was transferred for the same file (context), even with the --no-cache option:

# Run a build again after the previous build
❯ docker build --no-cache -f Dockefile -t large contexts/large
# ... Build context is very small
 => [internal] load build context                                                                                         0.0s
 => => transferring context: 28B                                                                                          0.0s
 => CACHED [1/2] FROM docker.io/library/alpine:3.15.4@sha256:4edbd2beb5f78b1014028f4fbb99f3237d9561100b6881aabbf5acce2c4  0.0s
 => [2/2] COPY file /file                                                                                                 0.6s
# ...

# Recreate the file
❯ dd if=/dev/urandom of=contexts/large/file bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.374838 s, 280 MB/s

❯ docker build --no-cache -f Dockefile -t large contexts/large
# ... Then it transfers the new build context (file)
 => [internal] load build context                                                                                         2.7s
 => => transferring context: 104.88MB                                                                                     2.7s
 => CACHED [1/2] FROM docker.io/library/alpine:3.15.4@sha256:4edbd2beb5f78b1014028f4fbb99f3237d9561100b6881aabbf5acce2c4  0.0s
 => [2/2] COPY file /file                                                                                                 0.7s

To sum up my questions:

  1. Does the size of the build context affect the resulting image only when something from the context, a file or a directory, is used in the Dockerfile? e.g. COPY build/context /here
    • Have there been any optimization options or improvements since the documentation was written?
  2. Is the "build context" cached (by BuildKit or the runtime)? I mean the context itself, not the image layers.

Update

Sending the whole build context can be reproduced with the legacy Docker builder:

# Create a 1GB file in `small' image build context
❯ dd if=/dev/urandom of=contexts/small/dummy bs=1M count=1024

# Measurement
❯ time sh -c "DOCKER_BUILDKIT=0 docker build -t small -f Dockerfile contexts/small/"
Sending build context to Docker daemon  1.074GB
Step 1/2 : FROM alpine:3.15.4
 ---> 0ac33e5f5afa
Step 2/2 : COPY file /file
 ---> 5b93097102e3
Successfully built 5b93097102e3
Successfully tagged small:latest

real    0m26.639s
user    0m2.809s
sys     0m4.557s

❯ time sh -c "DOCKER_BUILDKIT=0 docker build -t large -f Dockerfile contexts/large/"
Sending build context to Docker daemon  10.49MB
Step 1/2 : FROM alpine:3.15.4
 ---> 0ac33e5f5afa
Step 2/2 : COPY file /file
 ---> Using cache
 ---> 3e24b0f37389
Successfully built 3e24b0f37389
Successfully tagged large:latest

real    0m0.655s
user    0m0.227s
sys     0m0.161s

The question now is how BuildKit optimizes this.

Upvotes: 1

Views: 3426

Answers (1)

BMitch

Reputation: 264761

  1. Does the size of the build context affect the resulting image only when something from the context, a file or a directory, is used in the Dockerfile? e.g. COPY build/context /here
    • Have there been any optimization options or improvements since the documentation was written?

The image only grows when you copy a file from the build context into the image. However, many Dockerfiles copy entire directories without the author realizing everything those directories contain. Excluding specific files or subdirectories with a .dockerignore file shrinks the image in those cases.
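As a minimal sketch of the .dockerignore approach, the script below sets up a throwaway build context whose Dockerfile copies the whole directory, then excludes one large file. The temporary directory, the `dummy` file name, the `/app` destination and the `ignore-demo` tag are assumptions made up for illustration; they do not come from the question.

```shell
#!/bin/sh
# Sketch: keep a large, unneeded file out of the build context with .dockerignore.
set -eu

ctx=$(mktemp -d)   # hypothetical throwaway build context

# A Dockerfile that copies the entire context, which is where unwanted files sneak in.
printf 'FROM alpine:3.15.4\nCOPY . /app\n' > "$ctx/Dockerfile"

printf 'needed\n' > "$ctx/file"
# 1 MiB of padding that the image does not need.
dd if=/dev/zero of="$ctx/dummy" bs=1024 count=1024 2>/dev/null

# Exclude the padding file; "COPY . /app" can no longer pick it up.
printf 'dummy\n' > "$ctx/.dockerignore"

# The build itself is left to the reader, since it needs a running Docker daemon.
echo "Next step: docker build -t ignore-demo $ctx"
```

With the ignore file in place, `dummy` should be left out of the context, so it can neither be transferred nor copied into the image.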

  2. Is the "build context" cached (by BuildKit or the runtime)? I mean the context itself, not the image layers.

BuildKit changes this process dramatically, and the documentation was written with the classic build tooling in mind. With BuildKit, previous versions of the context are cached, and only the files explicitly copied into the image are fetched, using something similar to rsync to update its cache. Note that this doesn't apply when building in ephemeral environments, such as a CI server, that create a new BuildKit cache for each build.
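The rsync-like idea can be illustrated with a toy model. This is not BuildKit's actual implementation, only a sketch under stated assumptions: a hypothetical `manifest` file stands in for the builder's cached view of the context, change detection uses `cksum` for portability (a real tool may compare file metadata instead), and file names are assumed to contain no whitespace.

```shell
#!/bin/sh
# Toy model of incremental context transfer; NOT BuildKit's real algorithm.
set -eu

work=$(mktemp -d)
ctx="$work/context"
manifest="$work/manifest"   # hypothetical stand-in for the builder's cached view
mkdir -p "$ctx"
: > "$manifest"

# Print one sorted "checksum size path" line per file in the context.
snapshot() {
  (cd "$ctx" && find . -type f -exec cksum {} + | LC_ALL=C sort)
}

# Print the paths that would need transferring, then remember the new state.
sync_context() {
  snapshot > "$work/current"
  # comm -13 keeps only the lines that are new in the current snapshot.
  LC_ALL=C comm -13 "$manifest" "$work/current" | awk '{print $3}'
  cp "$work/current" "$manifest"
}

printf 'hello\n' > "$ctx/file"
first=$(sync_context)    # first build: the file is new to the cache
second=$(sync_context)   # rebuild without changes: nothing to send
printf 'changed\n' > "$ctx/file"
third=$(sync_context)    # file rewritten: it has to be sent again

echo "first:  ${first:-<nothing>}"
echo "second: ${second:-<nothing>}"
echo "third:  ${third:-<nothing>}"
```

The three runs mirror the pattern in the question: a full transfer on the first build, almost nothing on an unchanged rebuild, and a full transfer again once the file is recreated.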

Upvotes: 2
