Reputation: 3059
What happens when Kubernetes liveness-probe returns false?
Does Kubernetes restart that pod immediately?
Upvotes: 4
Views: 8721
Reputation: 3214
First, please note that livenessProbe concerns containers in the pod, not the pod itself. So if you have multiple containers in one pod, only the affected container will be restarted.
It's worth noting that there is a parameter failureThreshold, which defaults to 3. So, after 3 consecutive failed probes, a container will be restarted:
failureThreshold: When a probe fails, Kubernetes will try failureThreshold times before giving up. Giving up in case of liveness probe means restarting the container. In case of readiness probe the Pod will be marked Unready. Defaults to 3. Minimum value is 1.
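To make the quoted definition concrete, here is a small sketch of a probe with the tuning fields written out explicitly. The values are the documented defaults as far as I know; the exec command is simply the one used later in this answer.

```yaml
# Fragment of a container spec; values shown are the defaults.
livenessProbe:
  exec:
    command: ["cat", "/tmp/healthy"]
  initialDelaySeconds: 0   # start probing right after the container starts
  periodSeconds: 10        # run the probe every 10 seconds
  timeoutSeconds: 1        # a probe that takes longer counts as failed
  successThreshold: 1      # must be 1 for liveness probes
  failureThreshold: 3      # restart after 3 consecutive failures
```

With these values, a container that becomes unhealthy should be restarted roughly 20 to 30 seconds later (3 consecutive failures, 10 seconds apart), plus whatever time the shutdown itself takes.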
OK, we know that a container is restarted after 3 failed probes, but what does it mean to restart?
I found a good article about how Kubernetes terminates pods: Kubernetes best practices: terminating with grace. A container restart caused by a liveness probe seems to work in a similar way; I will share my experience below.
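So basically, when a container is being terminated because of a failed liveness probe, the steps are:
1. If there is a PreStop hook, it will be executed.
2. The SIGTERM signal is sent to the container.
3. Kubernetes waits for the grace period (terminationGracePeriodSeconds).
4. If the container is still running after that, the SIGKILL signal is sent and the container is forcibly removed.
So... if the app in your container handles the SIGTERM signal properly, the container will shut down and be started again. Typically this happens pretty fast (as I tested with the NGINX image), almost immediately.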
The situation is different when your application does not handle SIGTERM. In that case the SIGKILL signal is sent after the terminationGracePeriodSeconds period, which means the container is forcibly removed.
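The two knobs involved in the termination sequence can be set in the pod spec. The following is only a sketch: the pod name and the preStop command are my own illustrative assumptions, not something taken from the article mentioned above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-prestop          # hypothetical name
spec:
  # Time allowed between SIGTERM and SIGKILL (30 is the default).
  # Time spent in the preStop hook counts against this budget.
  terminationGracePeriodSeconds: 30
  containers:
  - name: liveness
    image: nginx
    lifecycle:
      preStop:
        exec:
          # Illustrative command: ask NGINX to finish in-flight requests and quit.
          command: ["/bin/sh", "-c", "nginx -s quit"]
```

If the process exits before the grace period ends, Kubernetes does not wait for the rest of it, which is why a well-behaved container restarts almost immediately.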
The example below is a modified version of the example from this doc; I additionally set failureThreshold: 1.
I have the following pod definition:
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: nginx
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      periodSeconds: 10
      failureThreshold: 1
Of course there is no /tmp/healthy file, so the livenessProbe will fail. The NGINX image handles the SIGTERM signal properly, so the container will be restarted almost immediately (after every failed probe). Let's check it:
user@shell:~/liveness-test-short $ kubectl get pods
NAME            READY   STATUS             RESTARTS   AGE
liveness-exec   0/1     CrashLoopBackOff   3          36s
So after ~30 sec the container has already been restarted a few times and its status is CrashLoopBackOff, as expected. I created the same pod without the livenessProbe and measured the time needed to shut it down:
user@shell:~/liveness-test-short $ time kubectl delete pod liveness-exec
pod "liveness-exec" deleted
real 0m1.474s
So it's pretty fast.
A similar example, but this time I added a sleep 3000 command:
...
    image: nginx
    args:
    - /bin/sh
    - -c
    - sleep 3000
...
Let's apply it and check...
user@shell:~/liveness-test-short $ kubectl get pods
NAME            READY   STATUS    RESTARTS   AGE
liveness-exec   1/1     Running   5          3m37s
So after ~4 min there are only 5 restarts. Why? Because we need to wait for the full terminationGracePeriodSeconds period (default is 30 seconds) on every restart. Let's measure the time needed to shut down:
user@shell:~/liveness-test-short $ time kubectl delete pod liveness-exec
pod "liveness-exec" deleted
real 0m42.418s
It's much longer.
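If you want the sleep variant to shut down quickly as well, one possible approach (a sketch based on my own assumptions, not something I measured above) is to have the shell handle SIGTERM itself and, optionally, to shorten the grace period. The value 5 is an arbitrary illustration.

```yaml
# Fragment of the pod definition above; only the changed parts are shown.
spec:
  terminationGracePeriodSeconds: 5   # assumed value; the default is 30
  containers:
  - name: liveness
    image: nginx
    args:
    - /bin/sh
    - -c
    # The trap makes the shell exit on SIGTERM; running sleep in the
    # background and calling wait lets the trap fire without delay.
    - "trap 'exit 0' TERM; sleep 3000 & wait"
```

With the trap in place the container should exit as soon as SIGTERM arrives, so the grace period only acts as an upper bound.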
To sum up:
What happens when Kubernetes liveness-probe returns false? Does Kubernetes restart that pod immediately?
The short answer is: by default no. Why?
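- Kubernetes will restart a container in a pod only after failureThreshold failed probes. By default it is 3 times, so after 3 failed probes.
- Depending on how the container handles SIGTERM, the time needed to terminate it can differ a lot.
- You can adjust both the failureThreshold and terminationGracePeriodSeconds parameters, so the container will be restarted immediately after every failed probe.
Upvotes: 8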