ilia

Reputation: 1132

How to split a Docker layer?

My Dockerfile looks like this:

RUN echo "Downloading huge file" && \
  wget   http://server/huge.tar.gz  && \
  echo "Extracting huge file" && \
  tar xzf huge.tar.gz  && \
  huge/install /opt/myinstall && \
  rm -rf huge*

In short, I am:

  1. Downloading a third-party installation package
  2. Unpacking it
  3. Running its installer
  4. Removing the installation files

The Docker build succeeds and I can use the new image.

The problem starts when I push the image to the Amazon container registry.

The push is rejected because the last layer is huge (about 20 GB).

20 GB is the real size of the installation, so there is little I can do to decrease it.

My question is: how can I split this one layer into several smaller layers to fit within Amazon's layer size limit?

Upvotes: 2

Views: 5290

Answers (2)

Jie Wu

Reputation: 192

I had the same problem: an image with a layer larger than 7 GB. Unfortunately, my huge file is a single binary.

First, I use the split command to break the file into parts:

split -b 1000M huge.bin part_

In the Dockerfile, replace the single huge-file copy with:

ADD data/part_aa /data/
ADD data/part_ab /data/
ADD data/part_ac /data/
ADD data/part_ad /data/
ADD data/part_ae /data/
ADD data/part_af /data/
ADD data/part_ag /data/

In entrypoint.sh, which is the startup script, add the following commands to recombine the parts:

cd /data
MODEL_FILE="huge.bin"
if [ ! -f "$MODEL_FILE" ]; then
  echo "combine model file parts, this may take 5 minutes"
  cat part_* > $MODEL_FILE
  echo "combine model file parts done"
fi
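
If you want to catch a corrupted or partially combined file at startup, you can add a checksum verification right after the combine step. This is only a sketch: it assumes a huge.bin.sha256 file is shipped alongside the parts, which is not part of the original setup.

# runs right after the combine step above; huge.bin.sha256 is an assumed extra
# file containing the expected checksum of the reassembled huge.bin
cd /data
if ! sha256sum -c huge.bin.sha256; then
  echo "huge.bin failed checksum verification" >&2
  # remove the corrupt file so the next start re-combines the parts
  rm -f huge.bin
  exit 1
fi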

To avoid copying the parts into the build context, you can instead download them with wget in the Dockerfile:

RUN wget -P /data http://192.168.1.111/files/data/part_aa
RUN wget -P /data http://192.168.1.111/files/data/part_ab
RUN wget -P /data http://192.168.1.111/files/data/part_ac
RUN wget -P /data http://192.168.1.111/files/data/part_ad
RUN wget -P /data http://192.168.1.111/files/data/part_ae
RUN wget -P /data http://192.168.1.111/files/data/part_af
RUN wget -P /data http://192.168.1.111/files/data/part_ag
  • WARNING: with wget, Docker's build cache only keys on the URL in the RUN instruction, whereas ADD checksums the file contents; if the remote file changes, a stale cached layer may be reused.
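
If you go the wget route, one way to work around that caching behaviour is to pass the expected checksum in as a build argument and verify it in the same RUN instruction. This is only a sketch and not part of the original answer: PART_AA_SHA256 and the verification step are assumptions, and changing the build-arg value is what forces Docker to re-run the instruction instead of reusing a stale layer.

# hypothetical: expected checksum supplied at build time
ARG PART_AA_SHA256
RUN wget -P /data http://192.168.1.111/files/data/part_aa && \
    echo "${PART_AA_SHA256}  /data/part_aa" | sha256sum -c -

It could then be built with something like:

docker build --build-arg PART_AA_SHA256=$(sha256sum part_aa | awk '{print $1}') .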

Upvotes: 2

yamenk

Reputation: 51866

A new layer is created for each Dockerfile instruction, so the solution is to split the RUN command into multiple RUN commands. However, I am not sure this will solve your problem if the tar is very big, as one of the layers will still contain the whole tar. Nonetheless, you should try this approach.

RUN wget http://server/huge.tar.gz
RUN tar xzf huge.tar.gz
RUN huge/install /opt/myinstall
RUN rm -rf huge*
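
If you try this, you can check whether each resulting layer actually stays under the registry's limit by listing per-layer sizes with docker history (the image tag below is just a placeholder):

docker history myimage:latest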

Another alternative is to use a Docker multi-stage build. The idea is to run the installation in a separate build stage and copy only the installation directory into your final image:

FROM ... as installer
RUN echo "Downloading huge file" && \
  wget   http://server/huge.tar.gz  && \
  echo "Extracting huge file" && \
  tar xzf huge.tar.gz  && \
  huge/install /opt/myinstall && \
  rm -rf huge*

FROM ...
COPY --from=installer /opt/myinstall /opt/myinstall
...

That way your final image contains only a single layer for the installation, holding just the copied files.
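
For illustration, a filled-in version of that multi-stage Dockerfile might look like the following. The ubuntu:22.04 base images and the apt-get line are assumptions; use whatever base image and tooling your installer actually needs.

# build stage: download, extract, and run the installer, then discard the stage
FROM ubuntu:22.04 as installer
RUN apt-get update && apt-get install -y wget && \
    wget http://server/huge.tar.gz && \
    tar xzf huge.tar.gz && \
    huge/install /opt/myinstall && \
    rm -rf huge*

# final stage: only the installed files are copied in, producing a single layer
FROM ubuntu:22.04
COPY --from=installer /opt/myinstall /opt/myinstall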

Upvotes: 3
