Reputation: 9377
As far as I understand, build stages in Docker are a fundamental concept. I have a practical understanding of them, but I have trouble coming up with a proper definition, and I also can't seem to find one.
So: what is the definition of a Docker build stage?
Edit: I'm not asking "how do I use a build stage?" or "how can I use multi-build stages?" which people seem very eager to answer :-)
The reason I have this question is because I saw the following sentences in the docs:
Which left me wondering: what exactly is a build stage?
Upvotes: 6
Views: 2969
Reputation: 25122
I don't think there will ever be a strict definition for Docker build stage because a build stage is in general something theoretical which:
In this question: Difference between build and deploy? one of the answers says...
Build means to Compile the project.
I think you can see it this way too. A build stage is any procedure that generates something which can later be taken and used.
The idea with docker multi-stage builds is to:
If you have read the docs, Alex Ellis has a nice example where the same logic takes place:
1. golang image: adds libraries, builds his app (Go generates a binary executable file)
2. alpine image: adds the executable file from step 1 and ships his app in an image that has a much smaller size.
Upvotes: 2
Reputation: 263469
A stage is the creation of an image. In a multi-stage build, you go through the process of creating more than one image; however, you typically only tag a single one (exceptions being multiple builds, building a multi-architecture image manifest with a tool like buildx, and anything else Docker releases after this answer).
Each stage, building a distinct image, starts from a FROM line in the Dockerfile. One stage doesn't inherit anything done in previous stages; it is based on its own base image. So if you have the following:
FROM alpine as stage1
RUN apk add your_tool
FROM alpine as stage2
RUN your_tool some args
you will get an error since your_tool is not installed in the second stage.
Which stage do you get as output from the build? By default the last stage, but you can change that with docker image build --target stage1 . to build the stage named stage1 in this example. The classic docker build will run from the top of the Dockerfile until it finishes the target stage. BuildKit builds a dependency graph and builds stages concurrently and only if needed, so do not depend on this ordering to control something like a testing workflow in your Dockerfile (BuildKit can see that nothing in the test stage is needed in your release stage and skip building the test).
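A minimal sketch of the behavior described above. The stage names (build, test, release) and the placeholder artifact are assumptions made up for illustration, not part of the original answer:

```dockerfile
# Hypothetical stages for illustration only.
FROM alpine AS build
RUN echo "artifact" > /artifact.txt

# Test stage: based on "build", but no later stage depends on it.
FROM build AS test
RUN grep -q artifact /artifact.txt

# Release stage: depends only on "build".
FROM alpine AS release
COPY --from=build /artifact.txt /artifact.txt
```

With BuildKit, a plain docker build . targets release and can skip test, because release only depends on build. Running docker build --target test . builds the test stage explicitly.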
What's the value of multiple stages? Typically, it's done to separate the build environment from the runtime environment. It allows you to perform the entire build inside of Docker. This has two advantages.
First, you don't require an external Makefile and various compilers and other tools installed on the host to compile the binaries that then get copied into the image with a COPY line; anyone with Docker can build your image.
And second, the resulting image doesn't include all the compilers or other build-time tooling that isn't needed at runtime, resulting in smaller and more secure images. The typical example is a Java app that needs Maven and a full JDK to build, but only the jar file and the JRE at runtime.
If each stage makes a separate image, how do you get the jar file from the build stage to the run stage? That comes from a new option to the COPY command, --from. An oversimplified multi-stage build looks like:
FROM maven as build
COPY src /app/src
WORKDIR /app/src
RUN mvn install
FROM openjdk:jre as release
# Copy to an explicit file path so that /app/app.jar exists for the CMD below
COPY --from=build /app/src/target/app.jar /app/app.jar
CMD java -jar /app/app.jar
With that COPY --from=build we are able to take the artifact built in the build stage and add it to the release stage, without including anything else from that first stage (no layers of compile tools like the JDK or Maven get added to our second stage).
How do the FROM x as y and the COPY --from=y /a /b work together? The FROM x as y defines an image name for the duration of this build, in this case y. Anywhere later in the Dockerfile that you would put an image name, you can put y and you'll get the result of this stage as your input. So you could say:
FROM upstream as mybuilder
RUN apk add common_tools
FROM mybuilder as stage2
RUN some_tool arg2
FROM mybuilder as stage3
RUN some_tool arg3
FROM minimal_base as release
COPY --from=stage2 /bin2 /
COPY --from=stage3 /bin3 /
Note how stage2 and stage3 are each FROM mybuilder, which is the output of the first stage.
The COPY --from=y allows you to change the context where you are copying from to be another image instead of the build context. It doesn't have to be another stage. So, for example, you could do the following to get a docker binary in your image:
FROM alpine
COPY --from=docker:stable /usr/local/bin/docker /usr/local/bin/
Further documentation on this is available at: https://docs.docker.com/develop/develop-images/multistage-build/
Upvotes: 4
Reputation: 111
A build stage starts at a FROM statement and ends at the step before the next FROM statement.
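As a sketch of that boundary rule, here is a small annotated Dockerfile. The base images and file paths are placeholders chosen for illustration. The stages are left unnamed, so the second one refers to the first by its index (stages are numbered from 0):

```dockerfile
# Stage 0 starts at this FROM ...
FROM alpine
RUN echo "hello" > /hello.txt
# ... and ends here, just before the next FROM.

# Stage 1 starts at this FROM and runs to the end of the file.
FROM alpine
COPY --from=0 /hello.txt /hello.txt
CMD ["cat", "/hello.txt"]
```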
Upvotes: 1
Reputation: 521995
stage | steɪdʒ |
noun
a point, period, or step in a process or development
Take a practical example: you want to build an image which contains a production-ready web server with TypeScript files compiled to JavaScript. You want to build that TypeScript within a Docker container to simplify dependency management. So you need:
In your final image you only really need the compiled .js files and, say, nginx. But to get there, you need all that other stuff first. If everything is built in a single image, the final image you upload will contain all the intermediate layers, even if they're unnecessary for the final product.
Docker build stages now allow you to actually separate those stages, or steps, into separate images, while still using just one Dockerfile and not needing to glue several Dockerfiles together with external shell scripts or such. E.g.:
FROM node as builder
RUN npm install ...
# whatever you need to build your files
FROM nginx as production
COPY --from=builder /final.js /var/www/html
The final result of this Dockerfile is a small image with nginx as its base plus just the final .js file. It does not contain all the unnecessary stuff like node.js and the npm dependencies.
builder here is the first stage, production is the second stage. In this case the first stage will be discarded at the end of the process, but you can also choose to build a specific stage using docker build --target=builder. A new FROM introduces a new, separate stage. They're essentially separate Dockerfiles, but they can share data using COPY --from.
Upvotes: 0
Reputation: 1092
Since version 17.05, Docker supports multiple stages during a single docker build execution.
This means that you no longer need to define only one source image in your Dockerfile and do the whole build in a single run. Instead, you can define multiple stages, each with its own base image, by using multiple FROM definitions:
# Build stage
FROM microsoft/aspnetcore
# ..do a build with a dev image for creating ./app artifact
# Publish - use a hardened, production image
FROM alpine:latest
# ..copy the ./app artifact from the build stage here, e.g. with COPY --from
CMD ["./app"]
This lets you break your image building process into stages, each optimized for the task it performs. For example, the stages could be:
Read more details about multistage-build:
Upvotes: 2