Reputation: 11552
How do people deal with persistent storage for your Docker containers?
I am currently using this approach: build the image, e.g. for PostgreSQL, and then start the container with
docker run --volumes-from c0dbc34fd631 -d app_name/postgres
IMHO, that has the drawback, that I must not ever (by accident) delete container "c0dbc34fd631".
Another idea would be to mount host volumes "-v" into the container, however, the userid within the container does not necessarily match the userid from the host, and then permissions might be messed up.
Note: Instead of --volumes-from 'cryptic_id'
you can also use --volumes-from my-data-container
where my-data-container
is a name you assigned to a data-only container, e.g. docker run --name my-data-container ...
(see the accepted answer)
Upvotes: 1072
Views: 320953
Reputation: 7120
To preserve or storing database data make sure your docker-compose.yml will look like if you want to use Dockerfile
version: '3.1'
services:
php:
build:
context: .
dockerfile: Dockerfile
ports:
- 80:80
volumes:
- ./src:/var/www/html/
db:
image: mysql
command: --default-authentication-plugin=mysql_native_password
restart: always
environment:
MYSQL_ROOT_PASSWORD: example
volumes:
- mysql-data:/var/lib/mysql
adminer:
image: adminer
restart: always
ports:
- 8080:8080
volumes:
mysql-data:
your docker-compose.yml will looks like if you want to use your image instead of Dockerfile
version: '3.1'
services:
php:
image: php:7.4-apache
ports:
- 80:80
volumes:
- ./src:/var/www/html/
db:
image: mysql
command: --default-authentication-plugin=mysql_native_password
restart: always
environment:
MYSQL_ROOT_PASSWORD: example
volumes:
- mysql-data:/var/lib/mysql
adminer:
image: adminer
restart: always
ports:
- 8080:8080
volumes:
if you want to store or preserve data of mysql then must remember to add two lines in your docker-compose.yml
volumes:
- mysql-data:/var/lib/mysql
and
volumes:
mysql-data:
after that use this command
docker-compose up -d
now your data will persistent and will not be deleted even after using this command
docker-compose down
extra:- but if you want to delete all data then you will use
docker-compose down -v
plus you can check your database data list by using this command
docker volume ls
DRIVER VOLUME NAME
local 35c819179d883cf8a4355ae2ce391844fcaa534cb71dc9a3fd5c6a4ed862b0d4
local 133db2cc48919575fc35457d104cb126b1e7eb3792b8e69249c1cfd20826aac4
local 483d7b8fe09d9e96b483295c6e7e4a9d58443b2321e0862818159ba8cf0e1d39
local 725aa19ad0e864688788576c5f46e1f62dfc8cdf154f243d68fa186da04bc5ec
local de265ce8fc271fc0ae49850650f9d3bf0492b6f58162698c26fce35694e6231c
local phphelloworld_mysql-data
Upvotes: 1
Reputation: 6324
When using Docker Compose, simply attach a named volume, for example:
version: '2'
services:
db:
image: mysql:5.6
volumes:
- db_data:/var/lib/mysql:rw
environment:
MYSQL_ROOT_PASSWORD: root
volumes:
db_data:
Upvotes: 17
Reputation: 17589
There are several levels of managing persistent data, depending on your needs:
-v host-path:container-path
to persist container directory data to a host directory.--volumes-from
to mount that data into your application container.Upvotes: 10
Reputation: 2978
Use Persistent Volume Claim (PVC) from Kubernetes, which is a Docker container management and scheduling tool:
The advantages of using Kubernetes for this purpose are that:
Upvotes: 0
Reputation: 9086
As of Docker Compose 1.6, there is now improved support for data volumes in Docker Compose. The following compose file will create a data image which will persist between restarts (or even removal) of parent containers:
Here is the blog announcement: Compose 1.6: New Compose file for defining networks and volumes
Here's an example compose file:
version: "2"
services:
db:
restart: on-failure:10
image: postgres:9.4
volumes:
- "db-data:/var/lib/postgresql/data"
web:
restart: on-failure:10
build: .
command: gunicorn mypythonapp.wsgi:application -b :8000 --reload
volumes:
- .:/code
ports:
- "8000:8000"
links:
- db
volumes:
db-data:
As far as I can understand: This will create a data volume container (db_data
) which will persist between restarts.
If you run: docker volume ls
you should see your volume listed:
local mypthonapp_db-data
...
You can get some more details about the data volume:
docker volume inspect mypthonapp_db-data
[
{
"Name": "mypthonapp_db-data",
"Driver": "local",
"Mountpoint": "/mnt/sda1/var/lib/docker/volumes/mypthonapp_db-data/_data"
}
]
Some testing:
# Start the containers
docker-compose up -d
# .. input some data into the database
docker-compose run --rm web python manage.py migrate
docker-compose run --rm web python manage.py createsuperuser
...
# Stop and remove the containers:
docker-compose stop
docker-compose rm -f
# Start it back up again
docker-compose up -d
# Verify the data is still there
...
(it is)
# Stop and remove with the -v (volumes) tag:
docker-compose stop
docker=compose rm -f -v
# Up again ..
docker-compose up -d
# Check the data is still there:
...
(it is).
Notes:
You can also specify various drivers in the volumes
block. For example, You could specify the Flocker driver for db_data:
volumes:
db-data:
driver: flocker
Disclaimer: This approach is promising, and I'm using it successfully in a development environment. I would be apprehensive to use this in production just yet!
Upvotes: 40
Reputation: 447
I'm just using a predefined directory on the host to persist data for PostgreSQL. Also, this way it is possible to easily migrate existing PostgreSQL installations to Docker containers: https://crondev.com/persistent-postgresql-inside-docker/
Upvotes: 2
Reputation: 9950
In case it is not clear from update 5 of the selected answer, as of Docker 1.9, you can create volumes that can exist without being associated with a specific container, thus making the "data-only container" pattern obsolete.
See Data-only containers obsolete with docker 1.9.0? #17798.
I think the Docker maintainers realized the data-only container pattern was a bit of a design smell and decided to make volumes a separate entity that can exist without an associated container.
Upvotes: 18
Reputation: 2339
My solution is to get use of the new docker cp
, which is now able to copy data out from containers, not matter if it's running or not and share a host volume to the exact same location where the database application is creating its database files inside the container. This double solution works without a data-only container, straight from the original database container.
So my systemd init script is taking the job of backuping the database into an archive on the host. I placed a timestamp in the filename to never rewrite a file.
It's doing it on the ExecStartPre:
ExecStartPre=-/usr/bin/docker cp lanti-debian-mariadb:/var/lib/mysql /home/core/sql
ExecStartPre=-/bin/bash -c '/usr/bin/tar -zcvf /home/core/sql/sqlbackup_$$(date +%%Y-%%m-%%d_%%H-%%M-%%S)_ExecStartPre.tar.gz /home/core/sql/mysql --remove-files'
And it is doing the same thing on ExecStopPost too:
ExecStopPost=-/usr/bin/docker cp lanti-debian-mariadb:/var/lib/mysql /home/core/sql
ExecStopPost=-/bin/bash -c 'tar -zcvf /home/core/sql/sqlbackup_$$(date +%%Y-%%m-%%d_%%H-%%M-%%S)_ExecStopPost.tar.gz /home/core/sql/mysql --remove-files'
Plus I exposed a folder from the host as a volume to the exact same location where the database is stored:
mariadb:
build: ./mariadb
volumes:
- $HOME/server/mysql/:/var/lib/mysql/:rw
It works great on my VM (I building a LEMP stack for myself): https://github.com/DJviolin/LEMP
But I just don't know if is it a "bulletproof" solution when your life depends on it actually (for example, webshop with transactions in any possible miliseconds)?
At 20 min 20 secs from this official Docker keynote video, the presenter does the same thing with the database:
"For the database we have a volume, so we can make sure that, as the database goes up and down, we don't loose data, when the database container stopped."
Upvotes: 0
Reputation: 91
If you want to move your volumes around you should also look at Flocker.
From the README:
Flocker is a data volume manager and multi-host Docker cluster management tool. With it you can control your data using the same tools you use for your stateless applications by harnessing the power of ZFS on Linux.
This means that you can run your databases, queues and key-value stores in Docker and move them around as easily as the rest of your application.
Upvotes: 8
Reputation: 19665
@tommasop's answer is good, and explains some of the mechanics of using data-only containers. But as someone who initially thought that data containers were silly when one could just bind mount a volume to the host (as suggested by several other answers), but now realizes that in fact data-only containers are pretty neat, I can suggest my own blog post on this topic: Why Docker Data Containers (Volumes!) are Good
See also: my answer to the question "What is the (best) way to manage permissions for Docker shared volumes?" for an example of how to use data containers to avoid problems like permissions and uid/gid mapping with the host.
To address one of the OP's original concerns: that the data container must not be deleted. Even if the data container is deleted, the data itself will not be lost as long as any container has a reference to that volume i.e. any container that mounted the volume via --volumes-from
. So unless all the related containers are stopped and deleted (one could consider this the equivalent of an accidental rm -fr /
) the data is safe. You can always recreate the data container by doing --volumes-from
any container that has a reference to that volume.
As always, make backups though!
UPDATE: Docker now has volumes that can be managed independently of containers, which further makes this easier to manage.
Upvotes: 10
Reputation: 1070
In Docker release v1.0, binding a mount of a file or directory on the host machine can be done by the given command:
$ docker run -v /host:/container ...
The above volume could be used as a persistent storage on the host running Docker.
Upvotes: 83
Reputation: 2589
It depends on your scenario (this isn't really suitable for a production environment), but here is one way:
Creating a MySQL Docker Container
This gist of it is to use a directory on your host for data persistence.
Upvotes: 6
Reputation: 4931
While this is still a part of Docker that needs some work, you should put the volume in the Dockerfile with the VOLUME instruction so you don't need to copy the volumes from another container.
That will make your containers less inter-dependent and you don't have to worry about the deletion of one container affecting another.
Upvotes: 15
Reputation: 18765
Use volume API
docker volume create --name hello
docker run -d -v hello:/container/path/for/volume container_image my_command
This means that the data-only container pattern must be abandoned in favour of the new volumes.
Actually the volume API is only a better way to achieve what was the data-container pattern.
If you create a container with a -v volume_name:/container/fs/path
Docker will automatically create a named volume for you that can:
docker volume ls
docker volume inspect volume_name
--volumes-from
connectionThe new volume API adds a useful command that lets you identify dangling volumes:
docker volume ls -f dangling=true
And then remove it through its name:
docker volume rm <volume name>
As @mpugach underlines in the comments, you can get rid of all the dangling volumes with a nice one-liner:
docker volume rm $(docker volume ls -f dangling=true -q)
# Or using 1.13.x
docker volume prune
The approach that seems to work best for production is to use a data only container.
The data only container is run on a barebones image and actually does nothing except exposing a data volume.
Then you can run any other container to have access to the data container volumes:
docker run --volumes-from data-container some-other-container command-to-execute
In this blog post there is a good description of the so-called container as volume pattern which clarifies the main point of having data only containers.
Docker documentation has now the DEFINITIVE description of the container as volume/s pattern.
Following is the backup/restore procedure for Docker 1.8.x and below.
BACKUP:
sudo docker run --rm --volumes-from DATA -v $(pwd):/backup busybox tar cvf /backup/backup.tar /data
RESTORE:
# Create a new data container
$ sudo docker run -v /data -name DATA2 busybox true
# untar the backup files into the new container᾿s data volume
$ sudo docker run --rm --volumes-from DATA2 -v $(pwd):/backup busybox tar xvf /backup/backup.tar
data/
data/sven.txt
# Compare to the original container
$ sudo docker run --rm --volumes-from DATA -v `pwd`:/backup busybox ls /data
sven.txt
Here is a nice article from the excellent Brian Goff explaining why it is good to use the same image for a container and a data container.
Upvotes: 1022
Reputation: 71
I recently wrote about a potential solution and an application demonstrating the technique. I find it to be pretty efficient during development and in production. Hope it helps or sparks some ideas.
Repo: https://github.com/LevInteractive/docker-nodejs-example
Article: http://lev-interactive.com/2015/03/30/docker-load-balanced-mongodb-persistence/
Upvotes: 3