Why doesn't postgres official docker repo start db service at build time?

Question

Context: the official PostgreSQL image, at https://github.com/docker-library/postgres (GitHub repo) and https://registry.hub.docker.com/_/postgres/ (Docker Hub).

The database is started through ENTRYPOINT and CMD, using the bash script

/docker-entrypoint.sh

with

ENTRYPOINT ["/docker-entrypoint.sh"]
EXPOSE 5432
CMD ["postgres"]

Another hook provided for customizing the database is the directory

/docker-entrypoint-initdb.d

which means the database is started (and can be reached with psql) only at run time, when the docker run command is issued.

This causes a problem: we cannot customize the database at build time, before it runs, for example to add extensions or populate it with data.

Of course, this could be done at run time, but that has the disadvantage of repeating the operation every time the image is run.

So, what is the logic behind this design, from the Docker or the PostgreSQL perspective? How could I add extensions and populate data at build time?

Thomasleveil · Accepted Answer

If you were to customize (create, populate data) a database at build time, that would imply that the database data is written into the docker image filesystem itself (as one cannot mount a volume at build time).

The issue with that is that the Docker image filesystem is a special one (AUFS, btrfs, etc.) that does not deliver good I/O performance for data-intensive applications such as a database server.

Consequently, you want your data written to a volume rather than to the Docker container filesystem. Since you don't know at build time which volume will be used at run time, and since there is no way to mount volumes at build time anyway, you should not create the database at build time.

Furthermore, if you take a close look at the Dockerfile of the official PostgreSQL image, you will see that there is a VOLUME instruction that makes the path at which the data is written a volume. That means that the image is designed so that the data will never hit the docker container filesystem.

If you look at the Dockerfiles of other databases or data-intensive applications, you will notice that they all operate in this manner. Another reason is that it is accepted good practice to make your Docker containers immutable.

If you want to install additional modules in your image, that is fine as long as they do not depend on data that would be written to a volume, and as long as you declare a volume for any path they would write data to.
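
As a rough sketch of that split, a derived image could install the extension binaries at build time and leave everything that touches the data to the /docker-entrypoint-initdb.d hook, which the official entrypoint runs on first start, when the data directory is still empty. The image tag, the package name (postgresql-16-postgis-3, picked only as an example), and the two .sql file names are assumptions for illustration, not something taken from the official repo:

```dockerfile
# Sketch only: the tag, package name, and file names are illustrative assumptions.
FROM postgres:16

# Binaries belong in the image: install the extension package at build time.
# postgresql-16-postgis-3 is just an example; substitute the package you need.
RUN apt-get update \
    && apt-get install -y --no-install-recommends postgresql-16-postgis-3 \
    && rm -rf /var/lib/apt/lists/*

# Data belongs on the volume: nothing is written to the database here.
# These hypothetical scripts are only copied into the hook directory; the
# entrypoint executes them at first start, against the volume in use.
# init-extensions.sql would contain e.g. CREATE EXTENSION IF NOT EXISTS postgis;
COPY init-extensions.sql /docker-entrypoint-initdb.d/10-extensions.sql
COPY seed-data.sql /docker-entrypoint-initdb.d/20-seed-data.sql
```

Because the hook is skipped when the data directory already contains a database, reusing the same volume across runs should not repeat the initialization; only a container started on an empty volume pays that cost.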

tl;dr

Application code/binary → docker image filesystem

Application data → docker volume
