Reputation: 481
https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#understand-build-context
After reading the above doc, I tested this on my local machine. My environment:
❯ sw_vers
ProductName: macOS
ProductVersion: 11.6.1
BuildVersion: 20G224
❯ docker version
Client:
Cloud integration: v1.0.22
Version: 20.10.13
API version: 1.41
Go version: go1.16.15
Git commit: a224086
Built: Thu Mar 10 14:08:44 2022
OS/Arch: darwin/amd64
Context: default
Experimental: true
Server: Docker Desktop 4.6.1 (76265)
Engine:
Version: 20.10.13
API version: 1.41 (minimum version 1.12)
Go version: go1.16.15
Git commit: 906f57f
Built: Thu Mar 10 14:06:05 2022
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.5.10
GitCommit: 2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
runc:
Version: 1.0.3
GitCommit: v1.0.3-0-gf46b6ba
docker-init:
Version: 0.19.0
GitCommit: de40ad0
I understood the following statement in the documentation to mean "Even if the Dockerfile does not use files in the build context directly, they still affect the resulting image size":
Inadvertently including files that are not necessary for building an image results in a larger build context and larger image size. This can increase the time to build the image, time to pull and push it, and the container runtime size
So, the first test builds two images with the same Dockerfile, one with a large file in the build context and one without:
❯ cat Dockefile
FROM alpine:3.15.4
RUN ["echo", "a"]
❯ mkdir -p contexts/small contexts/large
❯ touch contexts/small/file
❯ dd if=/dev/urandom of=contexts/large/file bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.404337 s, 259 MB/s
❯ ll **/*/file --block-size=M
-rw-r--r-- 1 hansuk 100M 4 12 14:15 contexts/large/file
-rw-r--r-- 1 hansuk 0M 4 12 14:15 contexts/small/file
❯ tree -a .
.
├── Dockefile
└── contexts
├── large
│ └── file
└── small
└── file
3 directories, 3 files
❯ docker build --no-cache -f Dockefile -t small contexts/small
[+] Building 3.0s (6/6) FINISHED
=> [internal] load build definition from Dockefile 0.0s
=> => transferring dockerfile: 78B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/alpine:3.15.4 2.3s
=> CACHED [1/2] FROM docker.io/library/alpine:3.15.4@sha256:4edbd2beb5f78b1014028f4fbb99f3237d9561100b6881aa 0.0s
=> [2/2] RUN ["echo", "a"] 0.4s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:695a2b36e3b6ec745fddb3d499acda249f7235917127cf6d68c93cc70665a1dc 0.0s
=> => naming to docker.io/library/small 0.0s
Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
❯ docker build --no-cache -f Dockefile -t large contexts/large
[+] Building 1.4s (6/6) FINISHED
=> [internal] load build definition from Dockefile 0.0s
=> => transferring dockerfile: 78B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/alpine:3.15.4 1.0s
=> CACHED [1/2] FROM docker.io/library/alpine:3.15.4@sha256:4edbd2beb5f78b1014028f4fbb99f3237d9561100b6881aa 0.0s
=> [2/2] RUN ["echo", "a"] 0.3s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:7c48ae10200ceceff83c6619e9e74d76b7799032c53fba57fa8adcbe54bb5cce 0.0s
=> => naming to docker.io/library/large 0.0s
Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
❯ docker image list
REPOSITORY TAG IMAGE ID CREATED SIZE
large latest 7c48ae10200c 7 seconds ago 5.57MB
small latest 695a2b36e3b6 15 seconds ago 5.57MB
The sizes of the resulting images are exactly the same, and there is no difference in the 'transferring context' size. (This message differs from the "Sending build context to Docker daemon ..." wording in the documentation, but I guess that's due to the BuildKit upgrade.)
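As a rough sanity check on what "transferring context" amounts to: the builder sends the context directory as a tar stream, so its size is approximately the size of a tarball of that directory. A minimal sketch (paths are illustrative; /dev/zero is used instead of /dev/urandom for speed):

```shell
# Approximate the build-context payload: the builder sends the context
# directory as a tar stream, so a plain tar of it is close to what is
# transferred (before any .dockerignore filtering).
mkdir -p /tmp/ctx-demo
dd if=/dev/zero of=/tmp/ctx-demo/file bs=1024 count=1024 2>/dev/null  # 1 MiB file
tar -cf - -C /tmp/ctx-demo . | wc -c  # ~1 MiB plus tar headers/padding
```

This matches the transcripts above: a ~100 MiB file produces a ~104.88 MB context transfer.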
Then, I changed the Dockerfile to use the file for second test:
❯ cat Dockefile
FROM alpine:3.15.4
COPY file /file
❯ docker build --no-cache -f Dockefile -t small contexts/small
# ... Eliding unnecessary logs
=> CACHED [1/2] FROM docker.io/library/alpine:3.15.4@sha256:4edbd2beb5f78b1014028f4fbb99f3237d9561100b6881aabbf5acce2c4 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 25B 0.0s
=> [2/2] COPY file /file 0.0s
❯ docker build --no-cache -f Dockefile -t large contexts/large
# ... Eliding unnecessary logs
=> [internal] load build context 4.1s
=> => transferring context: 104.88MB 4.0s
=> CACHED [1/2] FROM docker.io/library/alpine:3.15.4@sha256:4edbd2beb5f78b1014028f4fbb99f3237d9561100b6881aabbf5acce2c4 0.0s
=> [2/2] COPY file /file
❯ docker image list
REPOSITORY TAG IMAGE ID CREATED SIZE
large latest 426740a3fa3c 7 minutes ago 110MB
small latest 33a32d1e5aab 7 minutes ago 5.57MB
The resulting sizes differ, and the difference appears to match the file size. But I noticed that on further builds of the "large" image, no transfer occurred for the same file (context), even with the --no-cache option:
# Run a build again after the previous build
❯ docker build --no-cache -f Dockefile -t large contexts/large
# ... Build context is very small
=> [internal] load build context 0.0s
=> => transferring context: 28B 0.0s
=> CACHED [1/2] FROM docker.io/library/alpine:3.15.4@sha256:4edbd2beb5f78b1014028f4fbb99f3237d9561100b6881aabbf5acce2c4 0.0s
=> [2/2] COPY file /file 0.6s
# ...
# Recreate the file
❯ dd if=/dev/urandom of=contexts/large/file bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.374838 s, 280 MB/s
❯ docker build --no-cache -f Dockefile -t large contexts/large
# ... Then it transfers the new build context (file)
=> [internal] load build context 2.7s
=> => transferring context: 104.88MB 2.7s
=> CACHED [1/2] FROM docker.io/library/alpine:3.15.4@sha256:4edbd2beb5f78b1014028f4fbb99f3237d9561100b6881aabbf5acce2c4 0.0s
=> [2/2] COPY file /file 0.7s
To sum up my questions:
- Does the size of the build context affect the resulting image only when something in the context, a file or a directory, is used in the Dockerfile? e.g.
COPY build/context /here
- Are there optimization options or improvements since the documentation was written?
- Is the "build context" cached (by BuildKit or the runtime)? I mean the context itself, not the image layers.
Sending the whole build context every time can be reproduced with the legacy Docker builder:
# Create a 1GB file in `small' image build context
❯ dd if=/dev/urandom of=contexts/small/dummy bs=1M count=1024
# Measurement
❯ time sh -c "DOCKER_BUILDKIT=0 docker build -t small -f Dockerfile contexts/small/"
Sending build context to Docker daemon 1.074GB
Step 1/2 : FROM alpine:3.15.4
---> 0ac33e5f5afa
Step 2/2 : COPY file /file
---> 5b93097102e3
Successfully built 5b93097102e3
Successfully tagged small:latest
real 0m26.639s
user 0m2.809s
sys 0m4.557s
❯ time sh -c "DOCKER_BUILDKIT=0 docker build -t large -f Dockerfile contexts/large/"
Sending build context to Docker daemon 10.49MB
Step 1/2 : FROM alpine:3.15.4
---> 0ac33e5f5afa
Step 2/2 : COPY file /file
---> Using cache
---> 3e24b0f37389
Successfully built 3e24b0f37389
Successfully tagged large:latest
real 0m0.655s
user 0m0.227s
sys 0m0.161s
The question for now is how BuildKit optimizes this.
Upvotes: 1
Views: 3426
Reputation: 264761
- Does the size of the build context affect the resulting image only when the context, a file or a directory, is used in the Dockerfile? e.g.
COPY build/context /here
- Are there optimization options or improvements since the documentation was written?
The image will only grow when you copy files from the build context into the image. However, many images include entire directories without realizing what those directories contain. Excluding specific files or subdirectories with a .dockerignore would shrink the image in those cases.
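For example, a .dockerignore along these lines (the entries are illustrative, not from the post) keeps commonly unneeded paths out of the context, so they are neither transferred nor available to COPY:

```
# .dockerignore -- illustrative entries
.git
node_modules
*.log
build/
```

Patterns follow Go's filepath.Match syntax plus `**` for matching any number of directories.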
- Is the "build context" cached (by BuildKit or the runtime)? I mean the context itself, not the image layers.
BuildKit changes this process dramatically; the documentation was written around the classic build tooling. With BuildKit, previous versions of the context are cached, and only the files explicitly copied into the image are fetched, using something similar to rsync to update its cache. Note that this doesn't apply when building in ephemeral environments, like a CI server that creates a new BuildKit cache per build.
Upvotes: 2