Reputation: 3
The data in the database is intended to be surfaced by an API in another container. Previously, I successfully loaded the database at run time using this suggestion. However, my database is quite large (10 GB), and ideally I would not have to load it again each time I start a new container. I want the database to be loaded at build time. To accomplish this, I tried the following Dockerfile:
FROM mongo:4.0.6-xenial
COPY dump /data/dump
RUN mongod --fork --logpath /var/log/mongod.log \
&& mongorestore /data/dump \
&& mongo --eval "db.getSiblingDB('db').createUser({user:'user',pwd:'pwd',roles:['readWrite']})" \
&& mongod --shutdown
I expected the database to be in the container when I ran this image, but it was not, and the user did not exist either. However, as far as I can tell, the log file /var/log/mongod.log indicates that the database loaded successfully. Why did this not work?
Upvotes: 0
Views: 1893
Reputation: 103965
The official mongo Docker image writes the database data to a Docker volume.
At run time (that is, in a Docker container), keep in mind that files written to volumes are not written to the container file system. This is done to persist your data so that it survives container deletion but, more importantly in the context of a database, for performance reasons. To get good disk I/O performance, disk operations must be done on a volume, not on the container file system itself.
At build time (that is, when creating a Docker image), if RUN/ADD/COPY directives in your Dockerfile write files to a location which is already declared as a volume, those files will be discarded. However, if you write the files to a directory in your Dockerfile and only afterwards declare that directory as a volume, then the volume will keep those files unless you start your container specifying a volume with the docker run -v option.
This means that when your own Dockerfile is built FROM mongo, the /data/db and /data/configdb locations are already declared as volumes. Writing files to those locations at build time is pointless.
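To check which locations an image has already declared as volumes, the image metadata can be inspected. The following is a minimal sketch, assuming the Docker CLI is installed and the image has already been pulled locally; the helper name declared_volumes is hypothetical and only serves this illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the volumes declared by an image as JSON.
# Assumes a running Docker daemon and that the image is present locally
# (for example after `docker pull mongo:4.0.6-xenial`).
declared_volumes() {
  docker image inspect --format '{{json .Config.Volumes}}' "$1"
}

# Example call, to be run where Docker is available:
# declared_volumes mongo:4.0.6-xenial
```

Any path listed in the output is a location where files written by later build steps should be expected to be discarded.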
Knowing how volumes work, you could copy the contents of the Dockerfile of the official mongo Docker image and insert a RUN/ADD/COPY directive that writes the files you want to the /data/db location before the VOLUME /data/db /data/configdb directive.
Assuming you have a tar archive named mongo-data-db.tar with the contents of the /data/db location from a mongo container holding all the databases and collections you want, you can use the following Dockerfile and copy-initial-data-entry-point.sh to build an image which copies that data to the /data/db location every time the container is started. This only makes sense in a use case where such a container is used for a test suite which requires the very same initial data every time the container is started, as previous data is replaced with the initial data at each start.
Dockerfile:
FROM mongo
COPY ./mongo-data-db.tar /mongo-data-db.tar
COPY ./copy-initial-data-entry-point.sh /
RUN chmod +x /copy-initial-data-entry-point.sh
ENTRYPOINT [ "/copy-initial-data-entry-point.sh"]
CMD ["mongod"]
copy-initial-data-entry-point.sh:
#!/bin/bash
set -e
tar xf /mongo-data-db.tar -C /
exec /usr/local/bin/docker-entrypoint.sh "$@"
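Because the entry point extracts with tar xf /mongo-data-db.tar -C /, the archive's member names need to be relative paths such as data/db/... so that they land back in /data/db. The following is a minimal sketch of that round trip, using temporary directories as stand-ins for the container root; the directory variables and the sample file name are hypothetical.

```shell
#!/bin/bash
set -e
# Hypothetical scratch directories standing in for the source and target root file systems.
src_root="$(mktemp -d)"
dst_root="$(mktemp -d)"

# Create a fake data file where mongod would keep its data.
mkdir -p "$src_root/data/db"
echo "sample" > "$src_root/data/db/collection-0.wt"

# Archive data/db with member names relative to the source root.
tar cf "$src_root/mongo-data-db.tar" -C "$src_root" data/db

# Extract relative to the target root, mirroring `tar xf ... -C /` in the entry point.
tar xf "$src_root/mongo-data-db.tar" -C "$dst_root"

# The file reappears under data/db of the target root.
restored="$(cat "$dst_root/data/db/collection-0.wt")"
echo "$restored"
```

In the real container the target root is /, so the same relative member names end up under /data/db.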
In order to extract the contents of /data/db from the volume of a mongo container named my-mongo-container, proceed as follows:
docker stop my-mongo-container
docker run --rm --volumes-from my-mongo-container -v "$(pwd)":/out ubuntu tar cvf /out/mongo-data-db.tar /data/db
Note that this archive will be quite large, as it contains the full contents of the mongo server data, including indexes, as described in the mongo documentation.
Upvotes: 1