Andrew

Reputation: 609

speeding up 'apt-get update' to speed up Docker image builds

I want to add curl to a Docker image, and I'm using the following commands in a Dockerfile to do so:

RUN apt-get update
RUN apt-get install curl ca-certificates -y

My issue is that the initial update takes a pretty long time to run (2 min), so iteration is slow while I'm debugging my Dockerfile. This is especially true when I make changes before the RUN apt-get update line, which invalidates Docker's image cache.

Is there any way to be more selective with apt-get update, so it only updates enough to index where to download curl? Or some other technique I can use to speed up my Docker builds?

Here is the whole Dockerfile:

FROM postgres:9.6.10
ADD data/tsvs.tar.gz /standard_data
COPY postgres/*.sql /docker-entrypoint-initdb.d/

RUN apt-get update
RUN apt-get install curl ca-certificates -y
RUN curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -
RUN apt-get install postgis postgresql-9.6-postgis-scripts -y

I'm currently making changes to the SQL files in postgres/*.sql, hence the cache invalidation.

Upvotes: 5

Views: 16783

Answers (5)

Damir Nafikov

Reputation: 332

This answer helped me: https://stackoverflow.com/a/66714888/11285211

Specify the host network for the build: docker build --network host ...
Or, in compose:

service_name:
    container_name: name
    build: 
      context: .
      network: host

Upvotes: 2

soulphoenix

Reputation: 236

You can use sed to replace the sources.list links with the mirror method, so apt always chooses the best mirror.

For Ubuntu, you can add this to your Dockerfile before running apt:

RUN sed -i -E 's|https?://archive\.ubuntu\.com/ubuntu/|mirror://mirrors.ubuntu.com/mirrors.txt|g' /etc/apt/sources.list

This replaces the default http(s)://archive.ubuntu.com with mirror://mirrors.ubuntu.com/mirrors.txt in your sources.list file, in place, using sed.

Tested and working on the Ubuntu 20.04 Docker image.

Note: You might need to install ca-certificates to remove certificate errors when running apt.
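For context, here is a minimal sketch of where that line could sit in a Dockerfile. It assumes an Ubuntu base image (ubuntu:20.04, matching the version mentioned above) whose sources.list points at archive.ubuntu.com. It would not match anything in a Debian-based image such as the postgres one in the question.

```dockerfile
# Assumption: Ubuntu base image whose sources.list uses archive.ubuntu.com.
FROM ubuntu:20.04

# Switch apt to the mirror method before the first update.
RUN sed -i -E 's|https?://archive\.ubuntu\.com/ubuntu/|mirror://mirrors.ubuntu.com/mirrors.txt|g' /etc/apt/sources.list

# Update and install in one layer so the package index and the install stay in sync.
RUN apt-get update && \
    apt-get install -y curl ca-certificates
```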

Upvotes: 4

Rufus

Reputation: 5566

This post on Reddit suggests that copying your local apt sources.list into the image with COPY sources.list /etc/apt/ may let the container's apt-get update use local mirrors, which may speed things up.
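A rough sketch of that idea follows. It assumes a sources.list file sits in the build context next to the Dockerfile, and that it lists mirrors for the same distribution and release as the base image. A host file from a different release would likely break package installation.

```dockerfile
FROM postgres:9.6.10

# Assumption: sources.list is in the build context and targets the same
# distribution release as the base image.
COPY sources.list /etc/apt/

# Update and install against the mirrors listed in the copied file.
RUN apt-get update && \
    apt-get install -y curl ca-certificates
```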

Upvotes: 0

Mihai

Reputation: 10727

An image is organized in layers, and each layer depends on the previous layer. Layers are also cached for speed.

When you run the build again, Docker checks whether the checksum of each command line in the Dockerfile has changed (for ADD and COPY, it also checks the contents of the files being copied). If nothing changed, it pulls the layer from the cache. If something did, it rebuilds that layer and all successive layers.

In your particular case, the ADD and COPY instructions produce a new layer each time you change the files they copy, and that triggers all the successive layers to be rebuilt.

Moving the installation before them fixes this issue.

You should also put all the installations in one RUN instruction and clean the apt cache when you are done.

# Update and install in the same layer, then remove apt's caches and package lists.
RUN apt-get update && \
  apt-get install curl ca-certificates -y && \
  curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - && \
  apt-get install postgis postgresql-9.6-postgis-scripts -y && \
  apt-get clean && \
  rm -rf /var/lib/apt/lists/*
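Putting the two suggestions together (install first, one RUN instruction), the Dockerfile from the question might look like the sketch below. It is assembled from the instructions in the question and has not been tested as a build. It assumes the apt repositories already configured in the base image provide the postgis packages.

```dockerfile
FROM postgres:9.6.10

# Stable steps first: this layer is rebuilt only when this instruction
# or the base image changes.
RUN apt-get update && \
    apt-get install -y curl ca-certificates && \
    curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - && \
    apt-get install -y postgis postgresql-9.6-postgis-scripts && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Frequently edited files last, so changing them reuses the cached layer above.
ADD data/tsvs.tar.gz /standard_data
COPY postgres/*.sql /docker-entrypoint-initdb.d/
```

With this ordering, editing the SQL files invalidates only the final COPY layer, so the slow apt-get update step is served from the cache.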

Upvotes: 3

Andrew

Reputation: 609

If I move the curl installation steps before the parts that I'm changing, the cache will hit more often. My new file is:

FROM postgres:9.6.10
RUN apt-get update
RUN apt-get install curl ca-certificates -y
RUN curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -
RUN apt-get install postgis postgresql-9.6-postgis-scripts -y

ADD data/tsvs.tar.gz /standard_data
COPY postgres/*.sql /docker-entrypoint-initdb.d/
COPY postgres/subsetting.s* /docker-entrypoint-initdb.d/

h/t to Caleb H. for making me think of this with his comment.

Upvotes: 1
