user10664542

Reputation: 1306

What is the advantage of a docker multi-stage build in this example below (why is it done this way?)

Can anyone explain the advantages of multi-stage builds, especially what is going on here in this specific Dockerfile example?

ref: Section titled: Create an image from an alternative base image


Question:

What advantage does this approach have:

FROM python:buster as build-image
ARG FUNCTION_DIR="/function"

<instructions>

FROM python:buster
# Copy in the build image dependencies
COPY --from=build-image ${FUNCTION_DIR} ${FUNCTION_DIR}
<more instructions>

over this approach

FROM python:buster
<all needed instructions>

I don't see the advantage or why this approach would be taken, though I don't doubt there is one.


copy of Dockerfile from link above

# Define function directory
ARG FUNCTION_DIR="/function"

FROM python:buster as build-image

# Install aws-lambda-cpp build dependencies
RUN apt-get update && \
  apt-get install -y \
  g++ \
  make \
  cmake \
  unzip \
  libcurl4-openssl-dev

# Include global arg in this stage of the build
ARG FUNCTION_DIR
# Create function directory
RUN mkdir -p ${FUNCTION_DIR}

# Copy function code
COPY app/* ${FUNCTION_DIR}

# Install the runtime interface client
RUN pip install \
        --target ${FUNCTION_DIR} \
        awslambdaric

# Multi-stage build: grab a fresh copy of the base image
FROM python:buster

# Include global arg in this stage of the build
ARG FUNCTION_DIR
# Set working directory to function root directory
WORKDIR ${FUNCTION_DIR}

# Copy in the build image dependencies
COPY --from=build-image ${FUNCTION_DIR} ${FUNCTION_DIR}

ENTRYPOINT [ "/usr/local/bin/python", "-m", "awslambdaric" ]
CMD [ "app.handler" ]

Upvotes: 1

Views: 650

Answers (1)

David Maze

Reputation: 158656

A complete C toolchain is quite large; guesstimating, a Debian-based Python image such as python:buster probably triples in size if you install g++, make, and the required C header files. You only need this toolchain to build C extensions; you don't need it once they're built.

So the multi-stage build here runs in two parts:

  1. Install the full C toolchain. Build Python packages with C extensions and put them in a known directory. This is very large but is only an intermediate step.
  2. Start from a plain Python image, without the toolchain, and copy the built libraries from the first image. This is more moderately sized and is what is eventually run and redistributed.
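
For comparison, here is a rough sketch of what the single-stage alternative from the question might look like. It reuses the instructions from the question's Dockerfile; the `COPY app/ ${FUNCTION_DIR}/` form is a small simplification I'm assuming is equivalent for a flat `app` directory.

```dockerfile
# Single-stage variant, for comparison only
FROM python:buster

ARG FUNCTION_DIR="/function"

# The compiler toolchain is installed into the same image that gets shipped,
# so g++, make, cmake and the headers stay in its layers
RUN apt-get update && \
  apt-get install -y \
  g++ \
  make \
  cmake \
  unzip \
  libcurl4-openssl-dev

# Set working directory to function root directory (created if missing)
WORKDIR ${FUNCTION_DIR}

# Copy function code (assumes the build context has an app/ directory)
COPY app/ ${FUNCTION_DIR}/

# Install the runtime interface client; this step is what needs the toolchain
RUN pip install \
        --target ${FUNCTION_DIR} \
        awslambdaric

ENTRYPOINT [ "/usr/local/bin/python", "-m", "awslambdaric" ]
CMD [ "app.handler" ]
```

This builds and runs the same way, but every package installed by `apt-get` remains in the final image. In the two-stage Dockerfile from the question, those packages exist only in `build-image` and never reach the image you run and redistribute.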

This is a very typical use of a multi-stage build, and you see the same pattern in many languages. In Go you can have a first stage that builds a statically-linked binary and then a final stage that only contains the binary and not the compiler; in Node you can have a first stage that includes things like the TypeScript compiler and a final stage that only includes production dependencies; and so on.
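
As an illustration of the Go case, a minimal sketch might look like the following. The image tag, the stage name `build`, the `/src` and `/out/app` paths, and the project layout (a `go.mod` and `main` package at the root of the build context) are all assumptions made for the example, not anything from the question.

```dockerfile
# Stage 1: full Go toolchain, used only to compile
FROM golang:1.22 AS build
WORKDIR /src

# Hypothetical layout: go.mod and the main package sit at the context root
COPY . .

# CGO_ENABLED=0 disables cgo so the result is a statically linked binary
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: empty base image; only the compiled binary is shipped
FROM scratch
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
```

The structure mirrors the Python example: the first stage carries the heavy tooling, and `COPY --from=` moves only the build output into a clean final stage.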

Upvotes: 3
