ebbishop

Reputation: 1983

Uninstall applications within docker?

I am using docker to build and run a set of services. For one of these, I need to install a number of packages in order to complete an internal (gulp.js) build process, but they aren't necessary in the final docker image. It seems as though I should uninstall them while building the docker image, to keep the images smaller. Is there a standard practice for this?

Related: What is a reasonable size for a docker image? I realize "reasonable" is subjective, but does 1.4 GB sound wild?

This is my Dockerfile as it exists now:

FROM python:3.5.1-onbuild
ARG mode=dev

ENV MODE $mode

RUN apt-get update && apt-get install -y \
  build-essential \
  curl \
  && rm -rf /var/lib/apt/lists/*
RUN curl -sL https://deb.nodesource.com/setup_6.x | bash
RUN apt-get install -y nodejs
RUN cd $(npm root -g)/npm \
  && npm install fs-extra \
  && sed -i -e s/graceful-fs/fs-extra/ -e s/fs\.rename/fs.move/ ./lib/utils/rename.js


WORKDIR /usr/src/app/app
RUN npm install
RUN npm install --global gulp-cli
RUN npm rebuild node-sass
RUN gulp
RUN rm -rf node_modules

WORKDIR /usr/src/app
EXPOSE 8080
RUN python3 setup.py install
CMD python3 manage.py ${MODE}

Upvotes: 0

Views: 464

Answers (1)

helmbert

Reputation: 38014

Docker images use a layered filesystem. Each statement in a Dockerfile will cause a new layer to be added to the generated image. You can inspect these layers with the docker history command.

Removing packages in a build step will not reduce the overall image size, because the removed files will still be present within the parent layer (although the files will be marked as deleted and not be present when you create a container from this image).

Example

Consider a short example:

FROM debian:8
RUN dd if=/dev/zero of=/tmp/test bs=1024 count=100K
RUN rm /tmp/test

This example Dockerfile creates an image with a 100 MiB file in the first RUN statement and removes it in the second RUN statement. Now inspect the created image with docker history <image-id>:

IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
02b8c1077ac1        8 seconds ago       /bin/sh -c rm /tmp/test                         0 B                 
64a98e72b7ca        8 seconds ago       /bin/sh -c dd if=/dev/zero of=/tmp/test bs=10   104.9 MB            
737e719de6a4        4 weeks ago         /bin/sh -c #(nop)  CMD ["/bin/bash"]            0 B                 
1f2a121fc3e4        4 weeks ago         /bin/sh -c #(nop) ADD file:41ea5187c50116884c   123 MB      

As you can see, the 100 MiB layer is still there. The resulting image has an overall size of roughly 228 MB (the 123 MB base image plus the 104.9 MB layer):

REPOSITORY         TAG      IMAGE ID            CREATED             VIRTUAL SIZE
<none>             <none>   02b8c1077ac1        8 minutes ago       227.9 MB

How to fix

I see two possibilities to reduce your image size:

  1. Make sure your npm install command and the subsequent rm -rf node_modules run in the same build step:

    RUN npm install && \
        npm install --global gulp-cli && \
        npm rebuild node-sass && \
        gulp && \
        rm -rf node_modules
    

    This statement will only create a single filesystem layer in the resulting image, without the node_modules directory.

  2. "Squash" the image layers into one. Tools like docker-squash may help with this. Note that squashing image layers makes the (usually extremely efficient) docker push and docker pull operations less efficient. Typically, these operations only transfer the specific layers of an image that are not already present on the local (when pulling) or remote (when pushing) side. When your image only consists of one layer, docker push and pull will always transfer the entire image.
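Applied to the Dockerfile from the question, option 1 could look roughly like this (a sketch only; the surrounding WORKDIR and the exact npm steps are taken from the question's Dockerfile):

    WORKDIR /usr/src/app/app
    RUN npm install && \
        npm install --global gulp-cli && \
        npm rebuild node-sass && \
        gulp && \
        rm -rf node_modules

Because the `rm -rf node_modules` runs in the same `RUN` statement as the `npm install`, the `node_modules` directory never makes it into any committed layer.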
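For option 2, docker-squash is a third-party Python tool; a typical invocation looks something like this (the image tags `myapp:latest` and `myapp:squashed` are placeholders):

    # install the squashing tool
    pip install docker-squash

    # squash all layers of an existing image into a new single-layer tag
    docker-squash -t myapp:squashed myapp:latest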

Upvotes: 2
