Jono
Jono

Reputation: 3633

Build postgres docker container with initial schema

I'm looking to build dockerfiles that represent company databases that already exist. Similarly, I'd like create a docker file that starts by restoring a psql dump.

I have my psql_dump.sql in the . directory.

FROM postgres
ADD . /init_data
run "createdb" "--template=template0" "my_database"
run  "psql" "-d" "my_database"  --command="create role my_admin superuser"
run  "psql" "my_database" "<" "init_data/psql_dump.sql"

I thought this would be good enough to do it. I'd like to avoid solutions that use a .sh script. Like this solution.

I use template0 since the psql documentation says you need the same users created that were in the original database, and you need to create the database with template0 before you restore.

However, it gives me an error:

createdb: could not connect to database template1: could not connect to server: No such file or directory
        Is the server running locally and accepting

I'm also using docker compose for the overall application, if solving this problem in docker-compose is better, I'd be happy to use the base psql image and use docker compose to do this.

Upvotes: 58

Views: 86729

Answers (3)

thehale
thehale

Reputation: 1736

@Thomasleveil's answer will re-create the database schema at runtime, which is fine for most cases.

If you want to recreate the database schema at buildtime (i.e. if your schema initialization is really slow) you can invoke the stock docker_entrypoint.sh from within your Dockerfile.

However, since the docker_entrypoint.sh is designed to start a long-running database server, you have to add an extra script to exit the process after database initialization but before booting the long-running server.

Dockerfile (with build time database initialization)

# STAGE 1 - Equivalent to @Thomasleveil
FROM postgres AS runtime_init
ENV POSTGRES_DB my_database
COPY 1-psql_dump.sql /docker-entrypoint-initdb.d/

# STAGE 2 - Initialize the database during the build
FROM runtime_init AS buildtime_init_builder
RUN echo "exit 0" > /docker-entrypoint-initdb.d/100-exit_before_boot.sh
ENV PGDATA=/pgdata
RUN docker-entrypoint.sh postgres

# STAGE 3 - Copy the initialized db to a new image to reduce size.
FROM postgres AS buildtime_init
ENV PGDATA=/pgdata
COPY --chown=postgres:postgres --from=buildtime_init_builder /pgdata /pgdata

Important Notes

  • The stock postgres image will run initialization scripts in alphabetical order, so ensure that your database restoration scripts appear earlier than the exit_before_boot.sh script created in the Dockerfile.
    • This is demonstrated by the 1 and 100 prefixes shown above. Modify them to your liking.
  • Database updates to a running instance of this image will not be persisted across reboots since the PGDATA path where the database files are stored no longer maps to a volume mounted from the host machine.

Further Reading

Upvotes: 8

yishaiz
yishaiz

Reputation: 2583

As said in the comments, @Thomasleveil answer is great and simple if your schema recreation is fast. But in my case it's slow, and I wanted to use docker volumes, so here is what I did

  1. First use docker image as in @Thomasleveil answer to create a container with postgres with all the schema initialization

Dockerfile:

FROM postgres
WORKDIR /docker-entrypoint-initdb.d
ADD psql_dump.sql /docker-entrypoint-initdb.d
EXPOSE 5432
  1. then run it and create new local dir which contains the postgres data after its populated from the “psql_dump.sql” file: docker cp mypg:/var/lib/postgresql/data ./postgres-data

  2. Copy the data to a temp data folder, and start a new postgres docker-compose container whose volume is at the new temp data folder:

startPostgres.sh:

rm -r ./temp-postgres-data/data
mkdir -p ./temp-postgres-data/data
cp -r ./postgres-data/data ./temp-postgres-data/
docker-compose -p mini-postgres-project up

and the docker-compose.yml file is:

version: '3'
services:
  postgres:
    container_name: mini-postgres
    image: postgres:9.5
    ports:
    - "5432:5432"
    volumes:
      - ./temp-postgres-data/data:/var/lib/postgresql/data

Now you can run steps #1 and #2 on a new machine or if your psql_dump.sql changes. And each time you want a new clean (but already initialized) db, you can only run startPostgres.sh from step #3. And it still uses docker volumes.

Upvotes: 9

Thomasleveil
Thomasleveil

Reputation: 103915

According to the usage guide for the official PostreSQL Docker image, all you need is:

Dockerfile

FROM postgres
ENV POSTGRES_DB my_database
COPY psql_dump.sql /docker-entrypoint-initdb.d/

The POSTGRES_DB environment variable will instruct the container to create a my_database schema on first run.

And any .sql file found in the /docker-entrypoint-initdb.d/ of the container will be executed.

If you want to execute .sh scripts, you can also provide them in the /docker-entrypoint-initdb.d/ directory.

Upvotes: 80

Related Questions