Reputation: 20147
I have a Docker container that contains my Postgres database. It's using the official Postgres image which has a CMD entry that starts the server on the main thread.
I want to populate the database by running RUN psql –U postgres postgres < /dump/dump.sql
before it starts listening to queries.
I don't understand how this is possible with Docker. If I place the RUN
command after CMD, it will of course never be reached because Docker has finished reading the Dockerfile. But if I place it before the CMD
, it will run before psql even exists as a process.
How can I prepopulate a Postgres database in Docker?
Upvotes: 75
Views: 96294
Reputation: 5565
There is yet another option available that utilises Flocker:
Flocker is a container data volume manager that is designed to allow databases like PostgreSQL to easily run in containers in production. When running a database in production, you have to think about things like recovering from host failure. Flocker provides tools for managing data volumes across a cluster of machines like you have in a production environment. For example, as a Postgres container is scheduled between hosts in response to server failure, Flocker can automatically move its associated data volume between hosts at the same time. This means that when your Postgres container starts up on a new host, it has its data. This operation can be accomplished manually using the Flocker API or CLI, or automatically by a container orchestration tool that Flocker is integrates with, for example Docker Swarm, Kubernetes or Mesos.
Upvotes: 2
Reputation: 6238
After a lot of fighting, I have found a solution ;-)
For me was very useful a comment posted here: https://registry.hub.docker.com/_/postgres/ from "justfalter"
Anyway, I have done in this way:
# Dockerfile
FROM postgres:9.4
RUN mkdir -p /tmp/psql_data/
COPY db/structure.sql /tmp/psql_data/
COPY scripts/init_docker_postgres.sh /docker-entrypoint-initdb.d/
db/structure.sql
is a sql dump, useful to initialize the first tablespace.
Then, the init_docker_postgres.sh
#!/bin/bash
# this script is run when the docker container is built
# it imports the base database structure and create the database for the tests
DATABASE_NAME="db_name"
DB_DUMP_LOCATION="/tmp/psql_data/structure.sql"
echo "*** CREATING DATABASE ***"
# create default database
gosu postgres postgres --single <<EOSQL
CREATE DATABASE "$DATABASE_NAME";
GRANT ALL PRIVILEGES ON DATABASE "$DATABASE_NAME" TO postgres;
EOSQL
# clean sql_dump - because I want to have a one-line command
# remove indentation
sed "s/^[ \t]*//" -i "$DB_DUMP_LOCATION"
# remove comments
sed '/^--/ d' -i "$DB_DUMP_LOCATION"
# remove new lines
sed ':a;N;$!ba;s/\n/ /g' -i "$DB_DUMP_LOCATION"
# remove other spaces
sed 's/ */ /g' -i "$DB_DUMP_LOCATION"
# remove firsts line spaces
sed 's/^ *//' -i "$DB_DUMP_LOCATION"
# append new line at the end (suggested by @Nicola Ferraro)
sed -e '$a\' -i "$DB_DUMP_LOCATION"
# import sql_dump
gosu postgres postgres --single "$DATABASE_NAME" < "$DB_DUMP_LOCATION";
echo "*** DATABASE CREATED! ***"
So finally:
# no postgres is running
[myserver]# psql -h 127.0.0.1 -U postgres
psql: could not connect to server: Connection refused
Is the server running on host "127.0.0.1" and accepting
TCP/IP connections on port 5432?
[myserver]# docker build -t custom_psql .
[myserver]# docker run -d --name custom_psql_running -p 5432:5432 custom_psql
[myserver]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ce4212697372 custom_psql:latest "/docker-entrypoint. 9 minutes ago Up 9 minutes 0.0.0.0:5432->5432/tcp custom_psql_running
[myserver]# psql -h 127.0.0.1 -U postgres
psql (9.2.10, server 9.4.1)
WARNING: psql version 9.2, server version 9.4.
Some psql features might not work.
Type "help" for help.
postgres=#
# postgres is now initialized with the dump
Hope it helps!
Upvotes: 61
Reputation: 3201
I Followed the same solution which @damoiser , The only situation which was different was I wanted to import all dump data.
Please follow the solution below.(I have not done any kind of checks)
Dockerfile
FROM postgres:9.5
RUN mkdir -p /tmp/psql_data/
COPY db/structure.sql /tmp/psql_data/
COPY scripts/init_docker_postgres.sh /docker-entrypoint-initdb.d/
then the init_docker_postgres.sh
script
#!/bin/bash
DB_DUMP_LOCATION="/tmp/psql_data/structure.sql"
echo "*** CREATING DATABASE ***"
psql -U postgres < "$DB_DUMP_LOCATION";
echo "*** DATABASE CREATED! ***"
and then you can build your image as
docker build -t abhije***/postgres-data .
docker run -d abhije***/postgres-data
Upvotes: 2
Reputation: 1269
For those who want to initialize a PostgreSQL DB with millions of records during the first run.
You can do simple sql dump and copy the dump.sql
file into /docker-entrypoint-initdb.d/
. The problem is speed. My dump.sql
script is about 17MB (small DB - 10 tables with 100k rows in only one of them) and the initialization takes over a minute (!). That is unacceptable for local development / unit test, etc.
The solution is to make a binary PostgreSQL dump and use shell scripts initialization support. Then the same DB is initialized in about 500ms instead of 1 minute.
1. Create the dump.pgdata
binary dump of a DB named "my-db"
directly from within a container or your local DB
pg_dump -U postgres --format custom my-db > "dump.pgdata"
Or from host from running container (postgres-container)
docker exec postgres-container pg_dump -U postgres --format custom my-db > "dump.pgdata"
2. Create a Docker image with a given dump and initialization script
$ tree
.
├── Dockerfile
└── docker-entrypoint-initdb.d
├── 01-restore.sh
├── 02-small-updates.sql
└── dump.pgdata
$ cat Dockerfile
FROM postgres:11
COPY ./docker-entrypoint-initdb.d/ /docker-entrypoint-initdb.d/
$ cat docker-entrypoint-initdb.d/01-restore.sh
#!/bin/bash
file="/docker-entrypoint-initdb.d/dump.pgdata"
dbname=my-db
echo "Restoring DB using $file"
pg_restore -U postgres --dbname=$dbname --verbose --single-transaction < "$file" || exit 1
$ cat docker-entrypoint-initdb.d/02-small-updates.sql
-- some updates on your DB, for example for next application version
-- this file will be executed on DB during next release
UPDATE ... ;
3. Build an image and run it
$ docker build -t db-test-img .
$ docker run -it --rm --name db-test db-test-img
Upvotes: 49
Reputation: 882
My goal was to have an image that contains the database - i. e. saving the time to rebuild it everytime I do docker run
oder docker-compose up
.
We would just have to manage to get the line exec "$@"
out of docker-entrypoint.sh
. So I added into my Dockerfile
:
#Copy my ssql scripts into the image to /docker-entrypoint-initdb.d:
COPY ./init_db /docker-entrypoint-initdb.d
#init db
RUN grep -v 'exec "$@"' /usr/local/bin/docker-entrypoint.sh > /tmp/docker-entrypoint-without-serverstart.sh && \
chmod a+x /tmp/docker-entrypoint-without-serverstart.sh && \
/tmp/docker-entrypoint-without-serverstart.sh postgres && \
rm -rf /docker-entrypoint-initdb.d/* /tmp/docker-entrypoint-without-serverstart.sh
Upvotes: 0
Reputation: 88
My solution is inspired by Alex Dguez's answer which unfortunately doesn't work for me because:
RUN /docker-entrypoint.sh --help
never ran through for me, which always complained with The command '/bin/sh -c /docker-entrypoint.sh -' returned a non-zero code: 1
/docker-entrypoint-initdb.d
dirThe following answer is originally from my reply in another post: https://stackoverflow.com/a/59303962/4440427. It should be noted that the solution is for restoring from a binary dump instead of from a plain SQL as asked by the OP. But it can be modified slightly to adapt to the plain SQL case
Dockerfile:
FROM postgres:9.6.16-alpine
LABEL maintainer="[email protected]"
LABEL org="Cobrainer GmbH"
ARG PG_POSTGRES_PWD=postgres
ARG DBUSER=someuser
ARG DBUSER_PWD=P@ssw0rd
ARG DBNAME=sampledb
ARG DB_DUMP_FILE=example.pg
ENV POSTGRES_DB launchpad
ENV POSTGRES_USER postgres
ENV POSTGRES_PASSWORD ${PG_POSTGRES_PWD}
ENV PGDATA /pgdata
COPY wait-for-pg-isready.sh /tmp/wait-for-pg-isready.sh
COPY ${DB_DUMP_FILE} /tmp/pgdump.pg
RUN set -e && \
nohup bash -c "docker-entrypoint.sh postgres &" && \
/tmp/wait-for-pg-isready.sh && \
psql -U postgres -c "CREATE USER ${DBUSER} WITH SUPERUSER CREATEDB CREATEROLE ENCRYPTED PASSWORD '${DBUSER_PWD}';" && \
psql -U ${DBUSER} -d ${POSTGRES_DB} -c "CREATE DATABASE ${DBNAME} TEMPLATE template0;" && \
pg_restore -v --no-owner --role=${DBUSER} --exit-on-error -U ${DBUSER} -d ${DBNAME} /tmp/pgdump.pg && \
psql -U postgres -c "ALTER USER ${DBUSER} WITH NOSUPERUSER;" && \
rm -rf /tmp/pgdump.pg
HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
CMD pg_isready -U postgres -d launchpad
where the wait-for-pg-isready.sh
is:
#!/bin/bash
set -e
get_non_lo_ip() {
local _ip _non_lo_ip _line _nl=$'\n'
while IFS=$': \t' read -a _line ;do
[ -z "${_line%inet}" ] &&
_ip=${_line[${#_line[1]}>4?1:2]} &&
[ "${_ip#127.0.0.1}" ] && _non_lo_ip=$_ip
done< <(LANG=C /sbin/ifconfig)
printf ${1+-v} $1 "%s${_nl:0:$[${#1}>0?0:1]}" $_non_lo_ip
}
get_non_lo_ip NON_LO_IP
until pg_isready -h $NON_LO_IP -U "postgres" -d "launchpad"; do
>&2 echo "Postgres is not ready - sleeping..."
sleep 4
done
>&2 echo "Postgres is up - you can execute commands now"
The above scripts together with a more detailed README are available at https://github.com/cobrainer/pg-docker-with-restored-db
Upvotes: 1
Reputation: 3
We for E2E test in which we need a database with structure and data already saved in the Docker image we have done the following:
Dockerfile:
FROM postgres:9.4.24-alpine
ENV POSTGRES_USER postgres
ENV POSTGRES_PASSWORD postgres
ENV PGDATA /pgdata
COPY database.backup /tmp/
COPY database_restore.sh /docker-entrypoint-initdb.d/
RUN /docker-entrypoint.sh --help
RUN rm -rf /docker-entrypoint-initdb.d/database_restore.sh
RUN rm -rf /tmp/database.backup
database_restore.sh:
#!/bin/sh
set -e
pg_restore -C -d postgres /tmp/database.backup
To create the image:
docker build .
To start the container:
docker run --name docker-postgres -d -p 5432:5432 <Id-docker-image>
This does not restore the database every time the container is booted. The structure and data of the database is already contained in the created Docker image.
We have based on this article, but eliminating the multistage: Creating Fast, Lightweight Testing Databases in Docker
Edit: With version 9.4-alpine does not work now because it does not run the database_restore.sh scrips. Use version 9.4.24-alpine
Upvotes: 0
Reputation: 544
Alternatively, you can just mount a volume to /docker-entrypoint-initdb.d/ that contains all your DDL scripts. You can put in *.sh, *.sql, or *.sql.gz files and it will take care of executing those on start-up.
e.g. (assuming you have your scripts in /tmp/my_scripts)
docker run -v /tmp/my_scripts:/docker-entrypoint-initdb.d postgres
Upvotes: 31
Reputation: 145
I was able to load the data in by pre-pending the run command in the docker file with /etc/init.d/postgresql. My docker file has the following line which is working for me:
RUN /etc/init.d/postgresql start && /usr/bin/psql -a < /tmp/dump.sql
Upvotes: 0