yixing

Reputation: 464

How does Docker implement filesystem-level isolation?

I'm trying to find out how Docker implements filesystem-level isolation. I've done some reading on how containers isolate the filesystem, such as https://www.oreilly.com/library/view/container-security/9781492056690/ch04.html and https://www.youtube.com/watch?v=8fi7uSYlOdc, and both say that containers typically use chroot to achieve filesystem-level isolation. But when I use strace to trace the syscalls made by docker run, I don't find a chroot syscall, which confuses me.

The full command I use is strace -f -o result.txt docker run --rm -it ubuntu bash. The syscalls made by docker run are recorded in result.txt; I searched the whole file but can't find a chroot syscall.

My questions are:

  1. Is this caused by strace? That is, can strace not trace syscalls made inside the container, so chroot is missing from the output? If so, what other methods can I use to detect the chroot syscall?
  2. If chroot really isn't used by docker run, what mechanism does Docker use to achieve filesystem isolation?

Upvotes: 0

Views: 500

Answers (1)

KamilCuk

Reputation: 141698

tool to detect the syscalls used by docker run

Docker is a client-server application. All the docker CLI does is send requests to the Docker daemon; the daemon does all the work. You have to stop the Docker daemon and run dockerd under strace, and then you will see the chroot and pivot_root system calls.

Here I run docker-in-docker for testing, because I do not want to touch my host Docker daemon. I run dockerd under strace and then run an alpine image. I filter with grep because I am lazy. You can see the pivot_root("/var/lib/docker/overlay2..." syscall made by dockerd.

# docker run -ti --rm --privileged --entrypoint= docker:dind sh -l
   // inside docker:dind container:
0592bac28743:/# apk add strace 
...
0592bac28743:/# strace dockerd 2> >(grep pivot_root) &
   // wait a bit for dockerd to start
   // and finally:
0592bac28743:/# docker run -ti --rm alpine echo 1
Unable to find image 'alpine:latest' locally
latest: Pulling from library/alpine
7264a8db6415: Extracting [>                                                  ]  65.54kB/3.402MB
[pid    28] mkdirat(AT_FDCWD, "/var/lib/docker/overlay2/b744b0e22645db69ca63c6a68ed671d54ead252088fd74b2867b01efd88adad9/diff/.pivot_root27393485", 0700) = 0
[pid    28] pivot_root("/var/lib/docker/overlay2/b744b0e22645db69ca63c6a68ed671d54ead252088fd74b2867b01efd88adad9/diff", "/var/lib/docker/overlay2/b744b0e22645db69ca63c6a68ed671d54ead252088fd74b2867b01efd88adad9/diff/.pivot_root27393485") = 0
[pid    28] mount("", "/.pivot_root27393485", 0xc001025bb3, MS_REC|MS_PRIVATE, NULL) = 0
[pid    28] umount2("/.pivot_root27393485", MNT_DETACH) = 0
[pid    28] unlinkat(AT_FDCWD, "/.pivot_root27393485", 0) = -1 EISDIR (Is a directory)
7264a8db6415: Extracting [==================================================>]  3.402MB/3.402MB
[pid    28] newfstatat(AT_FDCWD, "/sbin/pivot_root", 0xc00072bbd8, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
[pid    28] symlinkat("/bin/busybox", AT_FDCWD, "/sbin/pivot_root") = 0
[pid    28] fchownat(AT_FDCWD, "/sbin/pivot_root", 0, 0, AT_SYMLINK_NOFOLLOW) = 0
[pid    28] utimensat(AT_FDCWD, "/sbin/pivot_root", [{tv_sec=1691413782, tv_nsec=0} /* 2023-08-07T13:09:42+0000 */, {tv_sec=1691413782, tv_nsec=07264a8db6415: Pull complete 
Digest: sha256:7144f7bab3d4c2648d7e59409f15ec52a18006a128c733fcff20d3a4a54ba44a
Status: Downloaded newer image for alpine:latest
[pid   223] pivot_root(".", "." <unfinished ...>
                                                [pid   223] <... pivot_root resumed>)   = 0
                                                                                           1
0592bac28743:/# 
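
The grep pivot_root filter above matches any line that contains that string, which is why the mkdirat and /sbin/pivot_root symlink lines show up alongside the real syscalls. A narrower filter could look roughly like the sketch below. The function name find_root_syscalls is made up for illustration, and it assumes the trace was saved to a file (for example with strace's -o option, as in the question) rather than piped.

```shell
# Hypothetical helper (not part of strace or docker): print only the lines
# of a saved strace log where chroot or pivot_root is the syscall itself.
# The name must sit at the start of the line or after whitespace and be
# followed by "(", so paths such as "/sbin/pivot_root" or
# ".pivot_root27393485" inside other calls are ignored.
find_root_syscalls() {
    grep -E '(^|[[:space:]])(chroot|pivot_root)\(' "$1"
}
```

Calling find_root_syscalls result.txt on a log captured from the daemon side would then list only the calls that actually change the root. Lines of the form "<... pivot_root resumed>" are deliberately not matched, since the matching "<unfinished ...>" line already counts that call.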

Upvotes: 2
