Paebbels

Reputation: 16249

How to split a 60GB docker layer to achieve better performance?

I'm installing AMD/Xilinx Vivado in a Docker image. Even an almost minimal installation is 60 GB in size, resulting in a 29 GB compressed image. A full installation is somewhere around 150 GB...

The image is created by:

  1. Using Debian Bookworm (debian:bookworm-slim) from Docker Hub
  2. Adding Python 3.12 (actually using python:3.12-slim from Docker Hub)
  3. Adding the packages needed by AMD Vivado via apt install ....
  4. Installing AMD Vivado as an almost minimal setup via ./xsetup ....

Once the software was installed, I noticed:

  1. Upload Problems
    • dockerd pushes a single layer with only around 135 Mb/s in a 1GbE network setup.
  2. Download Problems
    • dockerd is limited to at most 3 parallel download threads. Unlike docker push, docker pull achieves 975 Mb/s (the maximum of 1GbE). See Docker parallel operations limit
    • A downloaded layer is not extracted on the fly; it must be fully downloaded before extraction can start.
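As a side note, both thread limits are configurable in the Docker daemon configuration (the keys max-concurrent-downloads and max-concurrent-uploads in /etc/docker/daemon.json; the defaults are 3 and 5):

```json
{
  "max-concurrent-downloads": 8,
  "max-concurrent-uploads": 8
}
```

Raising them only helps when an image consists of more than one layer, which is exactly why splitting the big layer matters here.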

I found hints that splitting big layers into multiple layers improves performance. So I used a multi-stage build process where Vivado is installed in stage 1, and stage 2 mounts the result of stage 1 and uses RUN --mount... cp -r SRC DEST to create 15 layers between 1.5 and 8.8 GB.
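To pick split points of roughly even size, the installed tree can be inspected first (a sketch; the Vivado install path and version directory are placeholders and can be overridden via VIVADO_DIR):

```shell
#!/bin/sh
# List the top-level entries of the install tree, biggest first, to pick
# directories of similar size for the per-layer cp commands.
VIVADO_DIR="${VIVADO_DIR:-/opt/Xilinx/Vivado/2024.2}"
du -sk "${VIVADO_DIR}"/* 2>/dev/null | sort -rn | head -n 15
```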

The results show this:

  1. For docker push
    • dockerd is limited to at most 5 parallel upload threads.
      See Docker parallel operations limit
    • The parallel upload reaches around 550 Mb/s, roughly 4x the speed of a single upload thread.
  2. For docker pull
    • Downloading one big 60 GB layer takes the same time as downloading 15 layers with 3 parallel download threads: about 5 minutes, limited by the 1GbE network setup.
    • The 15-layer 60 GB image is fast, because finished layers are extracted by another, single (!) thread while the remaining layers are still downloading. Overall it took 5 minutes of download time plus 2 minutes to extract the layers still pending after all layers were downloaded.
    • The single-layer 60 GB image is almost twice as slow: it downloads the same data in 5 minutes, but then runs a single-threaded extraction which takes 8 minutes. That results in a total of 13 vs. 7 minutes.
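The totals can be sanity-checked with simple arithmetic (all numbers taken from the measurements above):

```shell
#!/bin/sh
download=5            # minutes, identical for both variants (1GbE bound)
extract_monolithic=8  # single-threaded extraction after the full download
extract_split_tail=2  # extraction work left over after overlapped download
echo "monolithic: $((download + extract_monolithic)) min"
echo "split:      $((download + extract_split_tail)) min"
```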

So here is my question: how can a 60 GB Docker layer be split into multiple layers to achieve better performance?

My first idea is a manually created Dockerfile to check the performance differences:

ARG REGISTRY
ARG NAMESPACE
ARG IMAGE
ARG IMAGE_TAG
ARG VIVADO_VERSION

FROM ${REGISTRY}/${NAMESPACE}/${IMAGE}:${IMAGE_TAG} AS base
ARG VIVADO_VERSION

# Install further dependencies for Vivado
RUN --mount=type=bind,target=/context \
    apt-get update \
 && xargs -a /context/Vivado.packages apt-get install -y --no-install-recommends \
 && rm -rf /var/lib/apt/lists/* \
 && apt-get clean


FROM base AS monolithic
ARG VIVADO_VERSION

RUN --mount=type=bind,target=/context \
    --mount=type=bind,from=vivado,target=/Install \
    cd /Install; \
    ./xsetup --batch Install -c /context/Vivado.${VIVADO_VERSION}.cfg --agree XilinxEULA,3rdPartyEULA


FROM base
ARG VIVADO_VERSION

RUN mkdir -p /opt/Xilinx/Vivado/${VIVADO_VERSION}/data/parts/xilinx

RUN --mount=type=bind,from=monolithic,source=/opt/Xilinx/Vivado/${VIVADO_VERSION},target=/Install cp -r /Install/gnu                           /opt/Xilinx/Vivado/${VIVADO_VERSION}
RUN --mount=type=bind,from=monolithic,source=/opt/Xilinx/Vivado/${VIVADO_VERSION},target=/Install cp -r /Install/ids_lite                      /opt/Xilinx/Vivado/${VIVADO_VERSION}
RUN --mount=type=bind,from=monolithic,source=/opt/Xilinx/Vivado/${VIVADO_VERSION},target=/Install cp -r /Install/lib                           /opt/Xilinx/Vivado/${VIVADO_VERSION}
RUN --mount=type=bind,from=monolithic,source=/opt/Xilinx/Vivado/${VIVADO_VERSION},target=/Install cp -r /Install/tps                           /opt/Xilinx/Vivado/${VIVADO_VERSION}
RUN --mount=type=bind,from=monolithic,source=/opt/Xilinx/Vivado/${VIVADO_VERSION},target=/Install cp -r /Install/data/secureip                 /opt/Xilinx/Vivado/${VIVADO_VERSION}/data
RUN --mount=type=bind,from=monolithic,source=/opt/Xilinx/Vivado/${VIVADO_VERSION},target=/Install cp -r /Install/data/xsim                     /opt/Xilinx/Vivado/${VIVADO_VERSION}/data
RUN --mount=type=bind,from=monolithic,source=/opt/Xilinx/Vivado/${VIVADO_VERSION},target=/Install cp -r /Install/data/deca                     /opt/Xilinx/Vivado/${VIVADO_VERSION}/data
RUN --mount=type=bind,from=monolithic,source=/opt/Xilinx/Vivado/${VIVADO_VERSION},target=/Install cp -r /Install/data/ip                       /opt/Xilinx/Vivado/${VIVADO_VERSION}/data
RUN --mount=type=bind,from=monolithic,source=/opt/Xilinx/Vivado/${VIVADO_VERSION},target=/Install cp -r /Install/data/simmodels                /opt/Xilinx/Vivado/${VIVADO_VERSION}/data
RUN --mount=type=bind,from=monolithic,source=/opt/Xilinx/Vivado/${VIVADO_VERSION},target=/Install cp -r /Install/data/parts/xilinx/zynquplus   /opt/Xilinx/Vivado/${VIVADO_VERSION}/data/parts/xilinx
RUN --mount=type=bind,from=monolithic,source=/opt/Xilinx/Vivado/${VIVADO_VERSION},target=/Install cp -r /Install/data/parts/xilinx/virtexuplus /opt/Xilinx/Vivado/${VIVADO_VERSION}/data/parts/xilinx
RUN --mount=type=bind,from=monolithic,source=/opt/Xilinx/Vivado/${VIVADO_VERSION},target=/Install cp -r /Install/data/parts/xilinx/kintexuplus /opt/Xilinx/Vivado/${VIVADO_VERSION}/data/parts/xilinx
RUN --mount=type=bind,from=monolithic,source=/opt/Xilinx/Vivado/${VIVADO_VERSION},target=/Install cp -r /Install/data/parts/xilinx/common      /opt/Xilinx/Vivado/${VIVADO_VERSION}/data/parts/xilinx
RUN --mount=type=bind,from=monolithic,source=/opt/Xilinx/Vivado/${VIVADO_VERSION},target=/Install cp -ru /Install/*                            /opt/Xilinx/Vivado/${VIVADO_VERSION}

# Configure Vivado tools
COPY FlexLM.config.sh Vivado.config.sh /tools/GitLab-CI-Scripts/
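For reference, a sketch of how this Dockerfile could be invoked (registry, namespace, tag, and path values are placeholders); the --mount=type=bind,from=vivado in the Dockerfile expects an additional named build context providing the installer:

```shell
# Build the split variant (the final, unnamed stage) with BuildKit.
docker buildx build \
  --build-arg REGISTRY=registry.example.com \
  --build-arg NAMESPACE=eda \
  --build-arg IMAGE=python \
  --build-arg IMAGE_TAG=3.12-slim \
  --build-arg VIVADO_VERSION=2024.2 \
  --build-context vivado=/path/to/vivado-installer \
  -t vivado:split .
```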

Upvotes: 3

Views: 411

Answers (0)
