Reputation: 440
So we have around 100 tests; each test connects to a postgres instance and consumes a database loaded with some data. The tests edit and change that data, so we reload the postgres database for each test.
This takes a really long time, so I thought of using Docker for this as follows. I'm new to Docker, so these are the steps I'm using:
1) I would create one postgres container, load it with the test database that I want and make it ready and polished.
2) Use this command to save my container as a tar file:
docker save -o postgres_testdatabase.tar postgres_testdatabase
3) For each test, I load the tar back into an image:
docker load -i postgres_testdatabase.tar
4) Run a container with the postgres instance:
docker run -i -p 5432 postgres_testdatabase
5) The test runs and changes the data.
6) Destroy the container and load a fresh container with a fresh copy of the test database.
7) Run the second test, and so on.
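Put together, steps 2-6 above correspond roughly to this command sequence (the container name test_db and the -p 5432:5432 mapping are illustrative; my actual run command is the one in step 4):
docker save -o postgres_testdatabase.tar postgres_testdatabase   # step 2: save as a tar file
docker load -i postgres_testdatabase.tar                         # step 3: load the tar back into an image
docker run -d --name test_db -p 5432:5432 postgres_testdatabase  # step 4: start a container from it
# ... run one test against the database ...
docker rm -f test_db                                             # step 6: destroy the container before the next test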
My problem is that I found out that when I back up a container to a tar file, load it, and then run a new container, I do not get my database; I basically get a fresh postgres installation with none of my databases.
What am I doing wrong?
EDIT:
I tried one of the suggestions, committing my changes before saving my container to an image, as follows:
I committed my updated container to a new image, saved that image to a tar file, deleted my existing container, loaded the tar file, and then ran a new container from my saved image. I still don't see my databases. I believe it has something to do with volumes. How do I do this without volumes? How do I force all my data to be in the container so it gets backed up with the image?
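Concretely, the sequence I tried looks roughly like this (the container name my_loaded_container is just a placeholder for the container I prepared):
docker commit my_loaded_container postgres_testdatabase         # commit the updated container to a new image
docker save -o postgres_testdatabase.tar postgres_testdatabase  # save that image to a tar file
docker rm -f my_loaded_container                                # delete the existing container
docker load -i postgres_testdatabase.tar                        # load the tar file
docker run -d -p 5432:5432 postgres_testdatabase                # run a new container from the saved image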
EDIT2
Warmoverflow suggested I use an sql file to load all my data while loading the image. This won't work in my case, since the data is carefully authored using another piece of software (ArcGIS), and the data has some complex blob fields (geometries), so loading it from an sql file won't work. He also suggested that I don't need to save the data as a tar file if I'm spawning containers on the same machine: once I'm satisfied with my data and commit it to an image, I can run a new container from that image. Thanks for clarifying this. Still, the problem is: how do I keep my database within my image, so that when I run a container from the image the database comes with it?
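In other words, on a single machine the tar step can be dropped; the cycle is just commit once, then run per test (names are placeholders again):
docker commit my_loaded_container postgres_testdatabase   # snapshot the prepared container as an image
docker run -d -p 5432:5432 postgres_testdatabase          # start a fresh container from that image for a test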
EDIT3
So I found a workaround inspired by warmoverflow's suggestion; this should solve my problem. However, I'm still looking for a cleaner way to do it.
The solution is to do the following:
Use pg_dumpall to dump the entire postgres instance into a single file with this command. We can run it from any postgres client, and we don't have to copy the dump file into the container. I'm running this from Windows:
C:\Program Files\PostgreSQL\9.3\bin>pg_dumpall.exe -h 192.168.99.100 -p 5432 -U postgres > c:\Hussein\dump\pg_test_dump.dmp
You can now safely delete your container.
Call this command against your container's postgres instance to load your dump:
C:\Program Files\PostgreSQL\9.3\bin>psql -f c:\Hussein\dump\pg_test_dump.dmp -h 192.168.99.100 -p 5432 -U postgres
Run the test; the test will mess up the data, so we need to reload, which means simply repeating the steps above (sketched below).
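Roughly, the reload before each test looks like this, assuming a clean postgres container is started before each restore (the container name test_db and the use of the stock postgres image are illustrative; the dump path and host are the ones from the commands above):
docker rm -f test_db                                 # throw away the container the previous test dirtied
docker run -d --name test_db -p 5432:5432 postgres   # start a clean postgres container
psql -f c:\Hussein\dump\pg_test_dump.dmp -h 192.168.99.100 -p 5432 -U postgres   # reload the dump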
I would still really want the container image to have the database "in it", so that when I run a container from an image, I get the database. It would be great if anyone could suggest a solution for that; it would save me a huge amount of time.
EDIT4
Finally, Warmoverflow solved it! Answer below.
Thanks
Upvotes: 2
Views: 3425
Reputation: 12107
docker save is for images (saving images as a tar file). What you need is docker commit, which commits container changes to an image; you can then save that image to a tar file. But if your database is the same for all tests, you should build a custom image using a Dockerfile, and then run your containers from that single image.
If your data is loaded using an sql file, you can follow the instructions in the "How to extend this image" section of the official postgres Docker Hub page, https://hub.docker.com/_/postgres/. You can create a Dockerfile with the following content:
FROM postgres
RUN mkdir -p /docker-entrypoint-initdb.d
ADD data.sql /docker-entrypoint-initdb.d/
Put your data.sql file and the Dockerfile in a new folder, and run docker build -t custom_postgres ., which will build a customized image for you. Every time you run a new container from it, it will load the sql file on boot.
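A rough usage sketch, assuming the folder with the Dockerfile and data.sql is the current directory (the container name test_db and the password value are just examples):
docker build -t custom_postgres .                                                        # build the customized image once
docker run -d --name test_db -p 5432:5432 -e POSTGRES_PASSWORD=123456 custom_postgres   # fresh container; data.sql runs on first boot
docker rm -f test_db                                                                     # discard it after the test and run again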
[Update]
Based on the new information from the question, the cause of the issue is that the official postgres image defines a VOLUME at the postgres data folder /var/lib/postgresql/data. VOLUME is used to persist data outside the container (for example when you use docker run -v to mount a host folder into the container), and thus any data inside the VOLUME is not saved when you commit the container itself. While this is normally a good idea, in this specific situation we actually need the data not to be persistent, so that a fresh new container with the same unmodified data can be started every time.
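If you want to check this yourself, you can inspect the image config (this check is not part of the steps below; docker inspect is standard Docker CLI):
docker inspect --format "{{ .Config.Volumes }}" postgres   # shows map[/var/lib/postgresql/data:{}] for images that declare the VOLUME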
The solution is to create your own version of the postgres image, with the VOLUME removed:
1) Download the Dockerfile of the official postgres image (linked from https://hub.docker.com/_/postgres/).
2) Put it in a new folder, together with the docker-entrypoint.sh file that the Dockerfile copies into the image.
3) Remove the VOLUME line from the Dockerfile.
4) Run docker build -t mypostgres ., which will build your own postgres image with the name mypostgres.
5) Run docker run -d -p 5432:5432 -e POSTGRES_PASSWORD=123456 mypostgres to start your container. The postgres db is available at postgres:123456@<docker host>:5432.
6) Load your test data into that postgres instance.
7) Run docker commit container_id_from_step_5 mypostgres_withdata. This creates your own postgres image with data.
8) Run docker rm -f container_id_from_step_5.
9) Run docker run -d -p 5432:5432 mypostgres_withdata to start a container, and remember to stop or remove the used container afterwards so that it won't occupy the 5432 port.
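Once mypostgres_withdata exists, resetting the database for each test is just a remove-and-run cycle, for example (the container name test_db is illustrative):
docker rm -f test_db                                            # drop the container the previous test modified
docker run -d --name test_db -p 5432:5432 mypostgres_withdata   # fresh container with the database already inside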
Upvotes: 5