Reputation: 2692
TL;DR - The solution to the problem, thanks to Paul
If you have the problem described below, the easiest way to solve it is to execute the following command before running the recipe that boots single-node k8s:
sudo chcon -Rt svirt_sandbox_file_t /var/lib/kubelet
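To confirm the relabel took effect, it can help to compare the SELinux type on the directory before and after the chcon. This is a minimal sketch, assuming an SELinux-enabled host where `ls -Zd /var/lib/kubelet` prints a context of the form user:role:type:level. The helper name `selinux_type` is made up for illustration; it only pulls the type field out of a context string.

```shell
# selinux_type: hypothetical helper, prints the type field
# (third of user:role:type:level) from an SELinux context string.
selinux_type() {
  printf '%s\n' "$1" | cut -d: -f3
}

# On a real host (not run here), inspect the directory's context with:
#   ls -Zd /var/lib/kubelet
# The two context strings below are examples of "before" and "after" labels.
selinux_type "system_u:object_r:docker_var_lib_t:s0"
selinux_type "system_u:object_r:svirt_sandbox_file_t:s0"
```

If the type printed for the directory is still something other than svirt_sandbox_file_t, the relabel has probably not been applied yet.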
Original Problem Description
I am trying to put together a k8s environment based on this recipe: https://github.com/kubernetes/kubernetes/blob/release-1.1/docs/getting-started-guides/docker.md for the purpose of integration testing our code base, which provisions containers in a k8s cluster. For copy/paste convenience, I include all the commands from the recipe in the section 'Run the Recipe' below.
I have a simple replication controller definition (reproduced below in 'Replication Controller Definition') for a very standard image (nginx). In this RC definition I attempt to mount a shared folder using 'emptyDir'. For simplicity, there is only one container in the replication controller definition, so there is not really much sharing going on.
Now, when I provision this RC against our multi-node cluster via the command 'kubectl create -f shared.folder.json', I am able to log into the container 'nginx' and run the following successfully:
touch /backup-folder/fooFile
The version info for our multi-node cluster is:
Server Version:
version.Info{
Major:"1",
Minor:"1+",
GitVersion:"v1.1.3-beta.0.308+71b088a96ee101-dirty",
GitCommit:"71b088a96ee101967fc06e1f95b1cade8f6e30f9", GitTreeState:"dirty"}
HOWEVER... things go differently when I bring up a single-node k8s cluster using the steps in 'Run the Recipe' and provision against that cluster using the command 'kubectl create -f shared.folder.json'. I spawn a bash shell in the nginx container and attempt the same touch command as above, but in the single-node case I get an error: touch: cannot touch '/backup-folder/fooo': Permission denied
In case it is useful here is the info I get from running mount -l in the two cases:
1) single node k8s
root@foo-hzxd6:/# mount -l | grep backup-folder
/dev/mapper/cl-root on /backup-folder type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
2) multi node k8s
root@foo-vcbc9:/# mount -l | grep backup-folder
/dev/vdb on /backup-folder type ext3 (rw,relatime,data=ordered)
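One visible difference between the two outputs is the `seclabel` mount option, which shows up when the filesystem is mounted with SELinux labeling active. A small sketch that checks a mount line for that option follows; `has_seclabel` is a hypothetical helper name, and the two sample lines are copied from the output above.

```shell
# has_seclabel: hypothetical helper; succeeds if the parenthesised option
# list of a "mount -l" line contains seclabel as a whole option.
has_seclabel() {
  opts=$(printf '%s\n' "$1" | sed -n 's/.*(\(.*\)).*/\1/p')
  case ",$opts," in
    *,seclabel,*) return 0 ;;
    *) return 1 ;;
  esac
}

single='/dev/mapper/cl-root on /backup-folder type xfs (rw,relatime,seclabel,attr2,inode64,noquota)'
multi='/dev/vdb on /backup-folder type ext3 (rw,relatime,data=ordered)'

if has_seclabel "$single"; then echo "single node: seclabel present"; fi
if has_seclabel "$multi"; then echo "multi node: seclabel present"; else echo "multi node: no seclabel"; fi
```

This only hints that SELinux is involved on the single-node host; it does not by itself show which rule is blocking the write.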
Replication Controller Definition
shared.folder.json
{
"kind": "ReplicationController",
"apiVersion": "v1",
"metadata":{
"name":"foo",
"labels":{
"app":"foo",
"role":"foo"
}
},
"spec": {
"replicas": 1,
"selector": {
"name": "nginx"
},
"template": {
"metadata": {
"name": "nginx",
"labels": {
"name": "nginx"
}
},
"spec": {
"containers": [
{
"name": "nginx",
"image": "nginx",
"imagePullPolicy": "Always",
"ports": [
{
"containerPort": 8080
},
{
"containerPort": 8081
}
],
"command": ["sleep", "10000"],
"volumeMounts": [
{
"name": "shared-volume",
"mountPath": "/backup-folder"
}
]
}
],
"volumes": [
{
"name": "shared-volume",
"emptyDir": { }
}
]
}
}
}
}
Run the Recipe
docker run --net=host -d gcr.io/google_containers/etcd:2.0.12 /usr/local/bin/etcd --addr=127.0.0.1:4001 --bind-addr=0.0.0.0:4001 --data-dir=/var/etcd/data
docker run \
--volume=/:/rootfs:ro \
--volume=/sys:/sys:ro \
--volume=/dev:/dev \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--volume=/var/lib/kubelet/:/var/lib/kubelet:rw \
--volume=/var/run:/var/run:rw \
--net=host \
--pid=host \
--privileged=true \
-d \
gcr.io/google_containers/hyperkube:v1.0.1 \
/hyperkube kubelet --containerized --hostname-override="127.0.0.1" --address="0.0.0.0" --api-servers=http://localhost:8080 --config=/etc/kubernetes/manifests
docker run -d --net=host --privileged gcr.io/google_containers/hyperkube:v1.0.1 /hyperkube proxy --master=http://127.0.0.1:8080 --v=2
Side Note
(Our version of k8s, which we use for the multi-node case, has some changes layered on top of 'stock' kubernetes, but I'm pretty sure none of these changes have to do with mounting folders.)
Epilogue - More Detail On SELinux denial of access to shared folder
In answer to Paul's request for more detail, here we go:
First, switch SELinux back to enforcing mode via: "setenforce 1"
Next, kill all docker containers, then relaunch single-node k8s via the 3-step recipe provided above.
Next, provision the pod via "kubectl create -f shared.folder.json"
Next, open a shell in the container via: "kubectl exec -i -t foo-podxxx -c nginx -- bash"
In the bash shell: "touch /backup-folder/blah"
RESULTS:
> sudo ausearch -ts recent -m AVC
----
time->Tue Jan 19 11:33:19 2016
type=SYSCALL msg=audit(1453231999.925:865015): arch=c000003e syscall=2 success=no exit=-13 a0=7ffd65fc1e45 a1=941 a2=1b6 a3=7ffd65fc09f0 items=0 ppid=25089 pid=25127 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts4 ses=4294967295 comm="touch" exe="/bin/touch" subj=system_u:system_r:svirt_lxc_net_t:s0:c202,c694 key=(null)
type=AVC msg=audit(1453231999.925:865015): avc: denied { create } for pid=25127 comm="touch" name="blah" scontext=system_u:system_r:svirt_lxc_net_t:s0:c202,c694 tcontext=system_u:object_r:docker_var_lib_t:s0 tclass=file
(backup-agent-scripts) /home/chris/dev/krylov/scripts >
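The key information in the AVC record is the pair of SELinux types: the container process (scontext) and the file it tried to create in (tcontext). As a rough sketch, a small sed-based helper can pull those types out of a denial line. The name `avc_type` is an assumption for illustration, and the sample line is the AVC record from the output above.

```shell
# avc_type: hypothetical helper; prints the SELinux type of the given
# field (scontext or tcontext) from a single AVC audit line.
avc_type() {
  field=$1
  line=$2
  printf '%s\n' "$line" | sed -n "s/.* ${field}=[^:]*:[^:]*:\([^:]*\):.*/\1/p"
}

avc='type=AVC msg=audit(1453231999.925:865015): avc: denied { create } for pid=25127 comm="touch" name="blah" scontext=system_u:system_r:svirt_lxc_net_t:s0:c202,c694 tcontext=system_u:object_r:docker_var_lib_t:s0 tclass=file'

echo "process type: $(avc_type scontext "$avc")"
echo "target type:  $(avc_type tcontext "$avc")"
```

Here the process runs as svirt_lxc_net_t and the target directory is labeled docker_var_lib_t, which is the mismatch behind the denied create.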
Log of kubelet: https://dl.dropboxusercontent.com/u/9940067/kubelet.log
Upvotes: 0
Views: 1733
Reputation: 2692
Here is the complete single-node k8s startup script that makes my problem go away. Thanks to Paul Morie for providing me with the solution (the magic first line in the script).
Update
Here is an update that Paul sent me on why chcon is used:
basically what it does is change the SELinux type for the volume directory that holds all the pod
volumes to svirt_sandbox_file_t, which is the context that most SELinux policies allow containers
(typically running with svirt_lxc_net_t) to use.
So, TLDR, that command makes the kube volume directory usable by docker containers (though of course containers
only have access to the volumes that are consumed in their pod and then mounted into the container).
My understanding is that Docker containers normally run in isolation and can't see each other's file systems. The chcon allows us to break this isolation in a controlled fashion, such that sharing is allowed to happen only through volume mount directives. This explanation seems relevant.
# magic selinux context set command is required. for details, see: http://stackoverflow.com/questions/34777111/cannot-create-a-shared-volume-mount-via-emptydir-on-single-node-kubernetes-on
#
sudo chcon -Rt svirt_sandbox_file_t /var/lib/kubelet
docker run --net=host -d gcr.io/google_containers/etcd:2.0.12 /usr/local/bin/etcd --addr=127.0.0.1:4001 --bind-addr=0.0.0.0:4001 --data-dir=/var/etcd/data
docker run \
--volume=/:/rootfs:ro \
--volume=/sys:/sys:ro \
--volume=/dev:/dev \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--volume=/var/lib/kubelet/:/var/lib/kubelet:rw \
--volume=/var/run:/var/run:rw \
--net=host \
--pid=host \
--privileged=true \
-d \
gcr.io/google_containers/hyperkube:v1.0.1 \
/hyperkube kubelet --containerized --hostname-override="127.0.0.1" --address="0.0.0.0" --api-servers=http://localhost:8080 --config=/etc/kubernetes/manifests
docker run -d --net=host --privileged gcr.io/google_containers/hyperkube:v1.0.1 /hyperkube proxy --master=http://127.0.0.1:8080 --v=2
sleep 20 # give everything time to launch
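A fixed sleep can be too short on a slow machine and needlessly long on a fast one. As an alternative sketch (not part of the original script), a small retry helper can poll until a readiness command succeeds. The name `wait_for` is hypothetical, and the commented-out health check assumes the API server started by the recipe answers on http://localhost:8080/healthz, which should be verified for your setup.

```shell
# wait_for: hypothetical helper; runs the given command up to N times,
# one second apart, and succeeds as soon as the command succeeds.
wait_for() {
  attempts=$1
  shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Assumed usage on the single-node host (not run here):
#   wait_for 60 curl -sf http://localhost:8080/healthz
# Self-contained demonstration with a command that always succeeds:
if wait_for 3 true; then echo "ready"; fi
```

If the helper returns non-zero, the script can stop early instead of running kubectl against a cluster that never came up.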
Upvotes: 1