Yogesh Jilhawar

Reputation: 6323

How the `docker cp` command works

The docker cp command copies files between the host machine and a container, in either direction. It works even when the container is stopped or has exited. Docker stores images in layers, and when we run a container from an image, it creates one more writable layer on top that holds all the changes made inside the container. Once we exit the container, this writable layer is gone. What I can't figure out is where Docker stores the container data that remains available to docker cp even after the container exits. I searched the /var/lib/docker directory, but no luck. I am using CentOS 7.2 with the devicemapper storage driver for Docker. Does anyone have any idea?

Upvotes: 3

Views: 2133

Answers (3)

SB Praveen

Reputation: 97

When a Docker container is created, it comes with a file system (the image's read-only file system plus the file system in the container's writable layer). Docker uses a union file system (UFS) mechanism for its file systems.
Union Mount/Union File Systems(UFS)
UFS allows multiple file systems to be overlaid, appearing to the user as a single file system. Folders may contain files from multiple file systems, but if two files have the exact same path, the last mounted file hides any previous ones. Docker supports several storage drivers: AUFS and Overlay are union file systems, while devicemapper, BTRFS, and ZFS get the same layering through snapshots. Which driver is used depends on the system and can be checked by running docker info, where it is listed under “Storage Driver.”
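The "last mounted file hides any previous files" rule can be tried out without Docker. The following is only a toy model using two ordinary directories; the names lower, upper, and union_cat are made up for illustration and this is not how any storage driver is implemented.

```shell
#!/bin/sh
# Toy model of a union mount. "lower" stands in for a read-only image layer,
# "upper" for the container's writable layer. Both are plain directories.
union_root=$(mktemp -d)
mkdir -p "$union_root/lower/etc" "$union_root/upper/etc"
echo "from image"     > "$union_root/lower/etc/motd"
echo "from image"     > "$union_root/lower/etc/hosts"
echo "from container" > "$union_root/upper/etc/motd"

# Resolve a path the way a union mount would: search the topmost layer first.
union_cat() {
    for layer in upper lower; do
        if [ -f "$union_root/$layer/$1" ]; then
            cat "$union_root/$layer/$1"
            return 0
        fi
    done
    echo "union_cat: $1: not found in any layer" >&2
    return 1
}

union_cat etc/motd    # present in both layers; the upper copy hides the lower one
union_cat etc/hosts   # present only in the lower layer, so it shows through
```

The first call prints the container's version of the file and the second falls through to the image's version, which is the behaviour a merged view gives you.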
Copy files to a stopped container
When files are copied to the container, they are stored in the file system of the container layer. On the host machine, that file system can be found at /var/lib/docker/overlay2/<id>/diff (on Ubuntu, with the overlay2 storage driver). To get this path, run docker inspect --format '{{.GraphDriver.Data.UpperDir}}' container-id; the MergedDir key in the same output is the combined view, which is mounted only while the container runs.
When a Docker container is stopped (not deleted), only the processes within the container stop; the container layer, including its file system, remains intact.
Thus files copied to a stopped container are stored in the container layer directory on the host (/var/lib/docker/overlay2/<id>/diff). When the container is started again, its file system is mounted with that directory as the writable layer, so the copied files are available inside the container.
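To look up the container layer directory for a given container, something like the sketch below should work. It assumes the overlay2 storage driver, where docker inspect reports the diff directory under the UpperDir key; the function name container_upper_dir is hypothetical, and other drivers expose different keys.

```shell
#!/bin/sh
# Print the host directory that backs a container's writable layer.
# Assumption: the daemon uses overlay2, so GraphDriver.Data has an UpperDir key.
container_upper_dir() {
    driver=$(docker inspect --format '{{.GraphDriver.Name}}' "$1") || return 1
    if [ "$driver" != "overlay2" ]; then
        echo "storage driver is $driver, not overlay2" >&2
        return 1
    fi
    docker inspect --format '{{.GraphDriver.Data.UpperDir}}' "$1"
}
```

For example, sudo ls "$(container_upper_dir container-id)" would list what the container has written, whether it is running or stopped.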
Credits: the answers by mohan08p and Don Kirkby, and O'Reilly's Using Docker

Upvotes: 0

Don Kirkby

Reputation: 56230

The docker container is not deleted when it exits, unless you use the --rm option. Instead, you can use docker cp to copy files to and from the container after it exits:

$ docker run --name run1 alpine sh -c "date > /tmp/test.txt"
$ docker cp run1:/tmp/test.txt test.txt
$ cat test.txt
Fri Oct  6 19:23:09 UTC 2017

You asked where those files are stored, so we need to do some digging. All of the docker data files seem to be under /var/lib/docker, so I looked around and found something under the overlay2 folder.

$ sudo ls -lt /var/lib/docker/100000.100000/overlay2
total 180
drwx------ 5 100000 100000 4096 Oct  6 12:23 bda6a77ed8fad218f738f32a149d4f9f9e46b103675b8b8f990f48e3339fd315
drwx------ 2 100000 100000 4096 Oct  6 12:23 l
drwx------ 5 100000 100000 4096 Oct  6 12:23 bda6a77ed8fad218f738f32a149d4f9f9e46b103675b8b8f990f48e3339fd315-init
drwx------ 3 100000 100000 4096 Oct  6 10:30 9c2aa6553beac112794143531e7760add2c94733898ae0674c5da30c3feb9451
...

There are many more folders there, but the most recent one probably has what we're looking for. The diff folder seems to hold all the files that changed between the image and the container. If I look in there, I find the file I created.

$ sudo ls -l /var/lib/docker/100000.100000/overlay2/bda6a77ed8fad218f738f32a149d4f9f9e46b103675b8b8f990f48e3339fd315/diff
total 4
drwxrwxrwt 2 100000 100000 4096 Oct  6 12:23 tmp
$ sudo ls -l /var/lib/docker/100000.100000/overlay2/bda6a77ed8fad218f738f32a149d4f9f9e46b103675b8b8f990f48e3339fd315/diff/tmp
total 4
-rw-r--r-- 1 100000 100000 29 Oct  6 12:23 test.txt
$ sudo cat /var/lib/docker/100000.100000/overlay2/bda6a77ed8fad218f738f32a149d4f9f9e46b103675b8b8f990f48e3339fd315/diff/tmp/test.txt
Fri Oct  6 19:23:09 UTC 2017
$ 

docker cp just does all the digging for me. Much easier. I'm using docker 17.09.0-ce, and I wouldn't be surprised if this internal structure changes from version to version.
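If you only want to know what a container changed, without walking the overlay2 folders by hand, docker diff reports it, marking each path as added (A), changed (C), or deleted (D). A small sketch that keeps just the added paths; the function name container_added_files is made up for illustration:

```shell
#!/bin/sh
# List the paths a container added on top of its image.
# docker diff prints one line per path, prefixed with A, C, or D and a space;
# sed keeps the "A" lines and strips the prefix.
container_added_files() {
    docker diff "$1" | sed -n 's/^A //p'
}
```

Running container_added_files run1 against the example above should include /tmp/test.txt, which can then be fetched with docker cp.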

Upvotes: 1

mohan08p

Reputation: 5382

First, a container comes with three things:

1) cgroups 2) namespaces 3) file systems

Each container spawned from an image is composed of all three, which means every container has its own file system.

The docker cp command copies files between the host machine and a container, in either direction. It works even when the container is stopped or has exited. Docker stores images in layers, and when we run a container from an image, it creates one more writable layer on top that holds all the changes made inside the container. Agreed.

We call that writable layer the container layer. Exiting the container does not discard this layer; it only looks lost if you never committed the changes or nothing new was written to it. If you commit, you get a new Docker image that includes this writable layer. When a container is deleted, any data written to the container that is not stored in a data volume is deleted along with the container.

The docker cp command works even when the container has exited, which means the data is stored somewhere on the host machine. Note that this is not a data volume, which is a directory or file system that we can mount directly into a container. Instead, the data inside the container is managed by the storage driver. On CentOS that is devicemapper, which stores images and layers in the thin pool and exposes them to containers by mounting them under subdirectories of /var/lib/docker/devicemapper/. The /var/lib/docker/devicemapper/mnt/ directory contains a mount point for each image and container layer that exists. Image layer mount points are empty, but a container’s mount point shows the container’s file system as it appears from within the container. devicemapper uses a snapshot mechanism in which each image layer is a snapshot of the layer below it.
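To check this on your own host, a sketch along these lines may help. It assumes a daemon that still ships the devicemapper driver and that docker inspect reports the thin device under GraphDriver.Data.DeviceName, as it did on the devicemapper hosts I have seen; the function names storage_driver and container_thin_device are hypothetical.

```shell
#!/bin/sh
# Print the storage driver the Docker daemon is using.
storage_driver() {
    docker info --format '{{.Driver}}'
}

# Print the thin-pool device that backs a container.
# Assumption: devicemapper is in use, so GraphDriver.Data has a DeviceName key.
container_thin_device() {
    docker inspect --format '{{.GraphDriver.Data.DeviceName}}' "$1"
}
```

If storage_driver prints devicemapper, container_thin_device container-id names the device in the thin pool rather than a directory you can browse under /var/lib/docker.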

The data lives at the block level. Because a container is a snapshot of its image, it does not hold a copy of a block it has never changed; it has a pointer to that block in the nearest parent image where it exists, and reads it from there. Blocks the container has written are allocated to the container's own snapshot in the thin pool. Both remain in place after the container exits, so a docker cp request against an exited container can still reach the files through those blocks.

Hope that clarifies the terminology. You can test it by experimenting with layers and containers. All the best.

Upvotes: 4
