fighter

Reputation: 337

How to check the directory structure of a docker tar archive?

Some users create image archives with docker save and send them to me. I would like to know the directory structure inside such an archive, for example whether the image contains /var, /root and /home/somefiles. How can I check the directory structure of the tar archive without running docker load xxx.tar and docker run? I need this because I have to add some files at specific paths in the image.

Upvotes: 4

Views: 7525

Answers (4)

Océane

Reputation: 41

I used what @lewo explained.

tar -xvf your-docker-image-as-a-tar-file
mkdir layers
cd layers
tar -xvf ../fdb124a038a8622f9a7028709f88ssbf695df7db9j9760d28e56bd5cde29e3e8/layer.tar 

I repeated the last command once for each layer, and in the end I got this:

user@vm:~/docker-image-filesystem/layer$ ls
bin  dev  etc  home  lib  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var
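If you only need the listing rather than the extracted files, roughly the same result can be had without unpacking anything to disk. This is a minimal sketch, not a finished tool: top_level_entries is a hypothetical helper name, and it assumes the archive stores each layer as <layer-id>/layer.tar, as in the commands above (other Docker versions may lay the archive out differently). It ignores deletions between layers, so it reports the union of what every layer adds.

```python
import tarfile


def top_level_entries(archive_path):
    """Return the sorted top-level names found across all layers of a `docker save` tar."""
    found = set()
    with tarfile.open(archive_path) as outer:
        for member in outer.getmembers():
            # Assumption: layers are stored as <layer-id>/layer.tar.
            if not member.name.endswith("/layer.tar"):
                continue
            layer_file = outer.extractfile(member)
            if layer_file is None:  # not a regular file, nothing to read
                continue
            # Open the nested tar straight from the outer archive.
            with tarfile.open(fileobj=layer_file) as layer:
                for name in layer.getnames():
                    # Drop empty and "." components so "./bin/sh" and "bin/sh" agree.
                    parts = [p for p in name.split("/") if p not in ("", ".")]
                    if parts:
                        found.add(parts[0])
    return sorted(found)
```

Calling top_level_entries("your-docker-image-as-a-tar-file") should give a list comparable to the ls output above, provided the layout assumption holds.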

Upvotes: 2

Konstantin Vustin

Reputation: 7618

It really depends on the Docker storage driver. For overlay2, the currently recommended driver, it can be done like this:

For example, I have a running ubuntu container:

$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
2bfe549061f0        ubuntu              "/bin/bash"         6 seconds ago       Up 5 seconds                            keen_lewin

Next, inspect the container to find where its data is stored:

$ docker inspect 2bfe549061f0
...
        "GraphDriver": {
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/39d10569b3d2543b9517c008e2244e1abde19aa3ec482ba0854cbdd332441e28-init/diff:/var/lib/docker/overlay2/32515dae6c1711b05d726b1cd4856f7df3a24a5beb36b412eec7713e1965ce7a/diff:/var/lib/docker/overlay2/b74326fdacfa1e713b567d286e76077f12f357d92a7c5418a850db5b78fb9d2b/diff:/var/lib/docker/overlay2/b14f98b9aefba75ba8d36005833eb4a9ec9aea84358756e753f7cd480d2c7c1a/diff:/var/lib/docker/overlay2/c3432c55caac2d52d2715624bece08daa309e8df0ba5207399b369d09bd6b360/diff:/var/lib/docker/overlay2/b81022c8b19d25f3bd88329ddade8dab337809fc90429c691df5e314bc773320/diff",
                "MergedDir": "/var/lib/docker/overlay2/39d10569b3d2543b9517c008e2244e1abde19aa3ec482ba0854cbdd332441e28/merged",
                "UpperDir": "/var/lib/docker/overlay2/39d10569b3d2543b9517c008e2244e1abde19aa3ec482ba0854cbdd332441e28/diff",
                "WorkDir": "/var/lib/docker/overlay2/39d10569b3d2543b9517c008e2244e1abde19aa3ec482ba0854cbdd332441e28/work"
...

Long story short: the unmodified base layer of the image is stored at the last path in the LowerDir key; the entries before it are the layers stacked on top. In my case that path is:

$ sudo ls -l /var/lib/docker/overlay2/b81022c8b19d25f3bd88329ddade8dab337809fc90429c691df5e314bc773320/diff 
    total 76
    drwxr-xr-x  2 root root 4096 May 26 06:45 bin
    drwxr-xr-x  2 root root 4096 Apr 24 14:34 boot
    drwxr-xr-x  4 root root 4096 May 26 06:44 dev
    drwxr-xr-x 29 root root 4096 May 26 06:45 etc
    drwxr-xr-x  2 root root 4096 Apr 24 14:34 home
    drwxr-xr-x  8 root root 4096 May 26 06:44 lib
    drwxr-xr-x  2 root root 4096 May 26 06:44 lib64
    drwxr-xr-x  2 root root 4096 May 26 06:44 media
    drwxr-xr-x  2 root root 4096 May 26 06:44 mnt
    drwxr-xr-x  2 root root 4096 May 26 06:44 opt
    drwxr-xr-x  2 root root 4096 Apr 24 14:34 proc
    drwx------  2 root root 4096 May 26 06:45 root
    drwxr-xr-x  4 root root 4096 May 26 06:44 run
    drwxr-xr-x  2 root root 4096 May 26 06:45 sbin
    drwxr-xr-x  2 root root 4096 May 26 06:44 srv
    drwxr-xr-x  2 root root 4096 Apr 24 14:34 sys
    drwxrwxrwt  2 root root 4096 May 26 06:45 tmp
    drwxr-xr-x 10 root root 4096 May 26 06:44 usr
    drwxr-xr-x 11 root root 4096 May 26 06:45 var

# Confirm it is Ubuntu 18.04
$ sudo cat /var/lib/docker/overlay2/b81022c8b19d25f3bd88329ddade8dab337809fc90429c691df5e314bc773320/diff/etc/os-release
NAME="Ubuntu"
VERSION="18.04 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
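Picking the last LowerDir entry by eye gets tedious, so it can be scripted. The sketch below assumes the overlay2 driver and that the inspect output actually contains a LowerDir key; lower_dirs and base_layer_dir are hypothetical helper names, not part of any Docker tooling.

```python
import json


def lower_dirs(inspect_json):
    """Split GraphDriver.Data.LowerDir from `docker inspect` output into a list of paths."""
    entries = json.loads(inspect_json)  # `docker inspect` prints a JSON array
    data = entries[0]["GraphDriver"]["Data"]
    # LowerDir is a single colon-separated string of directories.
    return data["LowerDir"].split(":")


def base_layer_dir(inspect_json):
    """Return the last LowerDir entry, the path picked by hand above."""
    return lower_dirs(inspect_json)[-1]
```

The input is the text printed by docker inspect 2bfe549061f0, for example captured with subprocess.run(["docker", "inspect", "2bfe549061f0"], capture_output=True, text=True, check=True).stdout.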

If I create a file inside the container, for example:

root@2bfe549061f0:/# echo 123 > check_file.txt

The change can then be found in UpperDir:

$ sudo ls -l /var/lib/docker/overlay2/39d10569b3d2543b9517c008e2244e1abde19aa3ec482ba0854cbdd332441e28/diff
total 4
-rw-r--r-- 1 root root 4 Oct 23 15:25 check_file.txt

$ sudo cat /var/lib/docker/overlay2/39d10569b3d2543b9517c008e2244e1abde19aa3ec482ba0854cbdd332441e28/diff/check_file.txt
123

The same kind of inspection can be done without any running container via:

docker image inspect <IMAGE>

More info:

https://docs.docker.com/storage/storagedriver/overlayfs-driver/#image-and-container-layers-on-disk

Upvotes: 0

David Maze

Reputation: 159733

The docker save/docker load format is lightly documented; I've run across very few tools that actually manipulate it in any form.

If you just want to add something to the image contents, the best way is to docker load it and then use a Dockerfile as normal:

FROM image:from-the-tar-file
COPY a_file.txt /

Otherwise, the tar file contains a directory per layer in the image, and each of those directories contains a layer.tar file with the actual layer contents. You'd have to write your own tool to inspect this, but you could use something like the Python tarfile library to look inside the file without fully expanding it.

Here's a quick Python script that checks if any layer in a docker save tar file contains a shell:

#!/usr/bin/env python3

import sys
import tarfile

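def main():
    # Open the outer archive produced by `docker save`.
    with tarfile.open(sys.argv[1]) as outer:
        # Each layer is stored as <layer-id>/layer.tar inside the archive.
        layers_tars = [n for n in outer.getnames() if n.endswith('/layer.tar')]
        for layer_tar in layers_tars:
            layer_file = outer.extractfile(layer_tar)
            if layer_file is None:
                # Not a regular file; nothing to read.
                continue
            # Read the nested tar directly from the outer archive, without unpacking it.
            with tarfile.open(fileobj=layer_file) as layer:
                if 'bin/sh' in layer.getnames():
                    # Report the layer that ships a shell.
                    print(layer_tar)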

if __name__ == '__main__':
    main()
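The script above visits layers in whatever order they appear in the archive. If the order matters, the manifest.json file at the top level of a docker save archive lists the layers of each saved image from the bottom layer upwards. A minimal sketch, assuming the archive holds a single image and that manifest.json has the usual "Layers" key (layers_in_order is a hypothetical helper name):

```python
import json
import tarfile


def layers_in_order(archive_path):
    """Return the layer paths listed in manifest.json, bottom layer first."""
    with tarfile.open(archive_path) as outer:
        manifest_file = outer.extractfile("manifest.json")
        manifest = json.load(manifest_file)
    # manifest.json is a JSON array with one entry per saved image;
    # this sketch only looks at the first one.
    return manifest[0]["Layers"]
```

Each returned name can be passed to outer.extractfile() in the same way as layer_tar in the script above.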

Modifying and reassembling the save file is tricky; I've never done it successfully. There are a couple of JSON files that don't have a well-specified format in the documentation, and it's not 100% clear what happens when layers aren't fully additive (a layer removes a file that was in a lower layer).
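For the non-additive case, the OCI image layer specification describes "whiteout" entries: a file named .wh.<name> in a layer hides <name> from the lower layers, and .wh..wh..opq hides everything the lower layers put in that directory. As far as I can tell, docker save layers follow the same convention, but treat that as an assumption. A sketch of how a merged path listing could honour it (apply_layer is a hypothetical helper, working on plain path strings rather than real files):

```python
import posixpath

WHITEOUT_PREFIX = ".wh."
OPAQUE_MARKER = ".wh..wh..opq"


def apply_layer(visible, layer_names):
    """Return the paths visible after stacking one layer's entries on top of `visible`."""
    result = set(visible)
    added = set()
    for name in layer_names:
        name = posixpath.normpath(name)
        parent, base = posixpath.split(name)
        if base == OPAQUE_MARKER:
            # Opaque directory: hide everything lower layers put under `parent`.
            result = {p for p in result if not p.startswith(parent + "/")}
        elif base.startswith(WHITEOUT_PREFIX):
            # Plain whiteout: hide the named path and anything beneath it.
            target = posixpath.join(parent, base[len(WHITEOUT_PREFIX):])
            result = {p for p in result
                      if p != target and not p.startswith(target + "/")}
        else:
            # Ordinary entry: it is added (or replaces the lower copy).
            added.add(name)
    # Whiteouts only affect lower layers, so this layer's own entries go in last.
    return result | added
```

Starting from an empty set and calling apply_layer once per layer, bottom layer first, with each layer's getnames() output gives an approximation of the final directory listing.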

Upvotes: 1

lewo

Reputation: 11

Basically, the Docker image archive contains all the layers, each of which is itself a tar file. I don't think you can easily inspect nested tar files with the tar tool alone: you would first need to untar the Docker archive and then run tar -tf on each layer.

Alternatively, skopeo might help, since it can "copy" a Docker archive to a directory:

skopeo copy docker-archive:image.tgz dir:/tmp/image

Upvotes: 0
