iquestionshard

Reputation: 1054

In Kubernetes, how can I safely tear down Long Running Processes?

I have a system where some long running tasks are done by processing messages from a message queue. The actual tasks are doing some significant processing on large videos.

The problem occurs in the following steps:

  1. A process in a pod takes a message off the queue and starts processing the video, which takes minutes.
  2. A developer makes a change and releases it, and a Kubernetes Deployment rollout starts.
  3. During the rollout, the long-running process is killed and replaced by a new pod, so all of its work is lost.

Is there a mechanism to work around this in Kubernetes? Some kind of check to ensure that the worker in the pod is in a state where it can be destroyed safely? Something like a destroyProbe (the opposite of a readinessProbe).

Upvotes: 1

Views: 1796

Answers (1)

acid_fuji

Reputation: 6853

Using a preStop hook, which runs before the container is terminated, should help you perform a graceful shutdown. The preStop hook is configured at the container level and lets you run a custom command before SIGTERM is sent. Note that the termination grace period countdown starts before the preStop hook is invoked, not when SIGTERM is sent.

This hook is called immediately before a container is terminated due to an API request or a management event such as a liveness probe failure, preemption, or resource contention. A call to the preStop hook fails if the container is already in a terminated or completed state. The hook is blocking (synchronous), so it must complete before the call to delete the container can be sent. No parameters are passed to the handler.
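Once the hook has finished and SIGTERM arrives, the worker itself still has to react sensibly. Below is a minimal sketch of one way to do that in Python: the worker finishes the message it is currently processing and only then stops polling. The class name `GracefulWorker` and the `fetch_message` / `process_message` callables are hypothetical stand-ins for your own queue client and video-processing code, not part of any Kubernetes API.

```python
import signal
import threading


class GracefulWorker:
    """Processes messages one at a time and stops only between messages."""

    def __init__(self, fetch_message, process_message, idle_seconds=1.0):
        # fetch_message() returns the next message or None (hypothetical queue client).
        # process_message(msg) does the long-running work (hypothetical).
        self._fetch = fetch_message
        self._process = process_message
        self._idle_seconds = idle_seconds
        self._stop = threading.Event()

    def request_stop(self, signum=None, frame=None):
        # Signature matches what signal.signal() expects from a handler.
        self._stop.set()

    def install_signal_handlers(self):
        # Kubernetes sends SIGTERM after the preStop hook completes.
        signal.signal(signal.SIGTERM, self.request_stop)

    def run(self):
        processed = 0
        # The stop flag is checked only between messages, so a message
        # that is already in progress is always allowed to finish.
        while not self._stop.is_set():
            message = self._fetch()
            if message is None:
                # Queue is empty: wait briefly, waking early if asked to stop.
                self._stop.wait(self._idle_seconds)
                continue
            self._process(message)
            processed += 1
        return processed
```

In a real worker you would call `install_signal_handlers()` once at startup and then `run()`. This only helps if the pod's grace period is longer than the longest single job; otherwise the container is killed with SIGKILL when the period expires.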

Setting an appropriate terminationGracePeriodSeconds also matters, because Kubernetes' management of the container blocks until the preStop handler completes, unless the Pod's grace period expires. Since the countdown starts before the preStop hook is invoked, the grace period has to cover both the hook and whatever time the process needs after SIGTERM is sent.
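As a rough illustration of how the grace period and the hook fit together, here is a sketch that builds the relevant part of a pod spec as a Python dict and prints it as JSON (which `kubectl apply -f` accepts). The field names `terminationGracePeriodSeconds` and `lifecycle.preStop.exec.command` are the real Kubernetes ones; the function name, container name, image, `sleep` command, and the timing numbers are assumptions chosen for the example.

```python
import json


def worker_pod_spec(image, max_job_seconds, pre_stop_seconds=5, margin_seconds=30):
    """Build a pod spec fragment whose grace period covers hook + longest job."""
    # The countdown starts before the preStop hook runs, so budget for
    # the hook, the longest job, and a safety margin (all assumed values).
    grace_seconds = pre_stop_seconds + max_job_seconds + margin_seconds
    return {
        "terminationGracePeriodSeconds": grace_seconds,
        "containers": [
            {
                "name": "video-worker",  # hypothetical container name
                "image": image,
                "lifecycle": {
                    "preStop": {
                        "exec": {
                            # Placeholder command; replace with whatever tells
                            # your worker to stop taking new messages.
                            "command": ["/bin/sh", "-c", f"sleep {pre_stop_seconds}"]
                        }
                    }
                },
            }
        ],
    }


spec = worker_pod_spec("example.com/video-worker:latest", max_job_seconds=600)
print(json.dumps(spec, indent=2))
```

In a Deployment this fragment would go under `spec.template.spec`. With the values above the grace period comes out at 635 seconds, so a job of up to ten minutes can finish before the container is force-killed.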

See the Kubernetes documentation on container lifecycle hooks and pod termination for more information.

Upvotes: 1
