Reputation: 5245
I'm working on a compute framework that will run on Google Container Engine (Kubernetes).
The desired behavior is that users will provide a container to be executed (this is the user payload; we are OK with this since the users are few and trusted). The user container will be uploaded to the registry beforehand.
When the framework runs, it will launch a number of workers (each in a pod, listening to a Celery queue), and a master node will load a batch of arguments to pass to the workers (through Celery/RabbitMQ).
When a worker runs, it will perform three things (for each work item):
SET UP: The worker will copy files and configurations from Google Cloud Storage and other places. The files will be placed in a pod volume.
EXECUTION: The worker should download the user container from the registry and run it. I also want to capture stdout and stderr from the container's process, and if possible add a deadline (if the container hasn't completed execution within X seconds, halt it). The user container will generate its results as files in a volume directory.
CLEAN UP and REPORTING: The host pod will copy some artifacts generated by the user container back to Google Cloud Storage. Other results will be reported to a proprietary application.
I want the framework to be invisible to the users (because we don't want to share credentials with them, and we want to keep any task-managing logic out of their hands).
Since the host is a container itself, I haven't found a good way to achieve this (pulling and running a container from a script that is itself running inside another container).
Is this possible to achieve in Kubernetes? Is there any documentation, or are there projects doing something similar? And are there any pitfalls with this approach?
Thanks!
Upvotes: 0
Views: 262
Reputation: 5245
Ended up solving it as follows:
First, I created a Job defined as follows (snippet):
apiVersion: batch/v1
kind: Job
metadata:
  name: item-001
spec:
  template:
    metadata:
      name: item-xxx
    spec:
      containers:
      - name: worker
        image: gcr.io/<something>/worker
        volumeMounts:
        - mountPath: /var/run/docker.sock
          name: docker-socket-mount
        - mountPath: /workspace
          name: workspace
      volumes:
      - name: docker-socket-mount
        hostPath:
          path: /var/run/docker.sock
      - name: workspace
        hostPath:
          path: /home/workspace
There are two mounts. The first, docker-socket-mount, mounts /var/run/docker.sock into the container so that I can use Docker from inside it; the second, workspace, mounts a volume that will be shared between the host and the guest container.
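A Job like this can then be submitted and checked with kubectl. A minimal sketch, assuming the snippet above is saved as item-001.yaml (the file name is just an example):
# Submit the Job for one work item and check on its progress
kubectl create -f item-001.yaml
kubectl get jobs item-001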
The worker runs a script similar to this:
#!/usr/bin/env bash
IMAGE=gcr.io/some/guest/image
# ...
# SET UP: copy the input files into the shared volume
gsutil -m cp -r gs://some/files/I/need/* /workspace
# ...
# EXECUTION: pull the user (guest) image and run it against the shared volume
export DOCKER_API_VERSION=1.23
gcloud docker -- pull ${IMAGE}
docker run -v /home/workspace:/workspace ${IMAGE}
# ...
With the Docker socket available, it is enough to install the Docker client in the worker image and call it normally. The trick is to mount the shared volume into the guest container using /home/workspace, the path as seen from the Kubernetes node, and not /workspace, the path as seen from the host (worker) container. The files downloaded to /workspace are then also available in the guest container.
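The question also asked about capturing stdout/stderr and enforcing a deadline. That can be wrapped around the docker run call with ordinary shell tools; a rough sketch (the 600-second limit and the log file names are placeholders):
# Run the guest with a deadline and capture its output (values are examples)
timeout 600 docker run -v /home/workspace:/workspace ${IMAGE} \
    > /workspace/stdout.log 2> /workspace/stderr.log
if [ $? -eq 124 ]; then
    echo "guest container exceeded the deadline and was stopped"
fi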
Finally, the Dockerfile looks similar to this:
FROM ubuntu:14.04
# ...
# Install Docker
RUN curl -fsSL https://get.docker.com/ | sh
# Install Google Cloud SDK
ADD xxx.json /home/keys/xxx.json
RUN curl https://sdk.cloud.google.com > /tmp/gcloud.sh
RUN bash /tmp/gcloud.sh --disable-prompts --install-dir=/home/tools/
RUN /home/tools/google-cloud-sdk/bin/gcloud auth activate-service-account [email protected] --key-file=/home/keys/xxx.json
RUN /home/tools/google-cloud-sdk/bin/gcloud config set project my-project
# ...
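As for the CLEAN UP and REPORTING step from the question, it happens in the same worker script once docker run returns; a minimal sketch (the output directory and destination bucket are placeholders):
# CLEAN UP and REPORTING: copy artifacts produced by the guest back to Cloud Storage
gsutil -m cp -r /workspace/output gs://my-bucket/results/item-001/
# ...reporting to the proprietary application would go here as well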
Upvotes: 1