Reputation: 117
My program calls clone() and runs /bin/sh in the child process. In the shell, I run cat /proc/$$/mountinfo to check the propagation type of each mount.
With only the CLONE_NEWNS flag, I get this:
# cat /proc/$$/mountinfo
194 193 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw,discard,errors=remount-ro
...
If I combine CLONE_NEWNS and CLONE_NEWUSER (by uncommenting flags |= CLONE_NEWUSER; in the source below), I get this:
199 198 8:1 / / rw,relatime master:1 - ext4 /dev/sda1 rw,discard,errors=remount-ro
...
Why does CLONE_NEWUSER make a difference? On my machine (Debian 9), I expected the mount to be MS_SHARED in both cases, since it is copied from an MS_SHARED mount point.
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)

static char container_stack[STACK_SIZE];
char *const container_args[] = {"/bin/sh", NULL};

/* Runs in the child: replace the process image with a shell. */
int container_main(void *arg) {
    (void)arg;
    printf("Container - inside the container!\n");
    printf("container pid is %d\n", getpid());
    /* execv() only returns on failure. */
    execv(container_args[0], container_args);
    perror("execv");
    return EXIT_FAILURE;
}

int main(void) {
    printf("Parent [ %d ] - start a container!\n", getpid());
    int flags = CLONE_NEWNS;
    //flags |= CLONE_NEWUSER;
    /* The stack grows downward, so pass the top of the buffer. */
    pid_t container_pid = clone(container_main, container_stack + STACK_SIZE,
                                SIGCHLD | flags, NULL);
    if (container_pid < 0) {
        perror("clone");
        return EXIT_FAILURE;
    }
    printf("Container pid is %d\n", container_pid);
    waitpid(container_pid, NULL, 0);
    printf("Parent - container stopped!\n");
    return EXIT_SUCCESS;
}
Upvotes: 1
Views: 115
Reputation: 48647
man 7 mount_namespaces explains it. Relevant excerpts:
* Each mount namespace has an owner user namespace. As
explained above, when a new mount namespace is created, its
mount point list is initialized as a copy of the mount point
list of another mount namespace. If the new namespace and the
namespace from which the mount point list was copied are owned
by different user namespaces, then the new mount namespace is
considered less privileged.
* When creating a less privileged mount namespace, shared mounts
are reduced to slave mounts. (Shared and slave mounts are
discussed below.) This ensures that mappings performed in
less privileged mount namespaces will not propagate to more
privileged mount namespaces
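As a rough check of the "different user namespaces" condition, you can compare the /proc/PID/ns/user links of the parent and the child: two processes in the same namespace see the same device and inode number there. The sketch below is only an illustration; same_namespace and the SAME_NS_DEMO macro are names I made up, and it compares the user namespaces of two processes, which is a proxy for (not a direct query of) the owner of each mount namespace.

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Compare two /proc/<pid>/ns/<type> links (hypothetical helper).
 * Returns 1 if both name the same namespace, 0 if they differ,
 * and -1 if either link cannot be stat()ed.
 */
int same_namespace(const char *link_a, const char *link_b)
{
    struct stat a, b;

    /* stat() follows the symlink to the namespace object itself. */
    if (stat(link_a, &a) != 0 || stat(link_b, &b) != 0)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

#ifdef SAME_NS_DEMO
/* Demo: compare our own user namespace with that of the given PID. */
int main(int argc, char **argv)
{
    char other[64];
    int r;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 2;
    }
    snprintf(other, sizeof other, "/proc/%s/ns/user", argv[1]);
    r = same_namespace("/proc/self/ns/user", other);
    if (r < 0) {
        perror("stat");
        return 1;
    }
    printf("%s user namespace\n", r ? "same" : "different");
    return 0;
}
#endif
```

Build it with -DSAME_NS_DEMO and run it from the parent's shell with the "Container pid" printed by the program in the question. With only CLONE_NEWNS it should report the same user namespace; with CLONE_NEWUSER added it should report a different one. Reading another process's ns links may require sufficient privileges.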
shared:X
This mount point is shared in peer group X. Each peer
group has a unique ID that is automatically generated by
the kernel, and all mount points in the same peer group
will show the same ID. (These IDs are assigned starting
from the value 1, and may be recycled when a peer group
ceases to have any members.)
master:X
This mount is a slave to shared peer group X.
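If you want to check these tags programmatically rather than by eye, here is a minimal sketch that pulls shared:X and master:X out of one mountinfo line. It assumes the field layout documented in proc(5): six fixed fields, then zero or more optional fields, then a lone "-" separator. The names struct propagation, parse_propagation and MOUNTINFO_DEMO are mine, not part of any library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Propagation tags found in the optional fields of one mountinfo line. */
struct propagation {
    int shared_group; /* X from "shared:X", or 0 if the tag is absent */
    int master_group; /* X from "master:X", or 0 if the tag is absent */
};

/*
 * Parse one line of /proc/<pid>/mountinfo (hypothetical helper).
 * Returns 0 on success, -1 if the "-" separator is never reached.
 */
int parse_propagation(const char *line, struct propagation *out)
{
    const char *p = line;
    int field;

    assert(line != NULL && out != NULL);
    out->shared_group = 0;
    out->master_group = 0;

    for (field = 1; ; field++) {
        size_t len;

        p += strspn(p, " ");       /* skip separating spaces */
        len = strcspn(p, " \n");   /* length of the current field */
        if (len == 0)
            return -1;             /* line ended before the separator */
        if (field > 6) {           /* optional fields start at field 7 */
            if (len == 1 && p[0] == '-')
                return 0;          /* separator: optional fields are done */
            if (len > 7 && strncmp(p, "shared:", 7) == 0)
                out->shared_group = atoi(p + 7);
            else if (len > 7 && strncmp(p, "master:", 7) == 0)
                out->master_group = atoi(p + 7);
        }
        p += len;
    }
}

#ifdef MOUNTINFO_DEMO
/* Demo: print the propagation tags of every mount we can see. */
int main(void)
{
    char line[4096]; /* assumes no mountinfo line is longer than this */
    FILE *f = fopen("/proc/self/mountinfo", "r");

    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    while (fgets(line, sizeof line, f) != NULL) {
        struct propagation pr;
        if (parse_propagation(line, &pr) == 0)
            printf("shared=%d master=%d  %s",
                   pr.shared_group, pr.master_group, line);
    }
    fclose(f);
    return 0;
}
#endif
```

Using 0 for "tag absent" is safe because peer group IDs start at 1. On the two lines from the question, the first gives shared_group 1 and master_group 0, and the second gives shared_group 0 and master_group 1, which is the shared-to-slave reduction described above. Build with -DMOUNTINFO_DEMO to run it against /proc/self/mountinfo inside the spawned shell.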
Upvotes: 2