Reputation: 1975
I have a Job with restartPolicy: "Never"
whose pod ends with status Error.
Since the restart policy is "Never", this Job should not restart.
However, a new pod is created again and again each time the previous pod fails:
$ kubectl get pods
NAME          READY   STATUS            RESTARTS   AGE
kafka-pvbqk   0/1     Error             0          2m19s
kafka-ttt95   0/1     Error             0          109s
kafka-7fqgl   0/1     Error             0          69s
kafka-rrmlk   0/1     PodInitializing   0          2s
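For reference, the Job spec is roughly like this (a trimmed-down sketch: the kafka name matches the pods above, but the image and the rest of the template are placeholders for my actual manifest):

apiVersion: batch/v1
kind: Job
metadata:
  name: kafka
spec:
  template:
    spec:
      containers:
        - name: kafka
          image: my-kafka-image:latest   # placeholder for the real image
      restartPolicy: Never               # the kubelet never restarts the container in place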
I know the first thing to do is to fix the underlying error, but I also want to understand why a new pod keeps being created and how to avoid that.
Thanks
Upvotes: 12
Views: 6962
Reputation: 44687
That is the correct behavior, not a bug. The restart policy you are pointing to applies to the Pod, not to the Job itself.
To fail a Job after a certain number of retries, set .spec.backoffLimit,
which specifies the number of retries before the Job is considered failed.
The back-off limit is set by default to 6. Failed Pods associated with the Job are recreated by the Job controller with an exponential back-off delay (10s, 20s, 40s ...) capped at six minutes. The back-off count is reset when a Job's Pod is deleted or successful without any other Pods for the Job failing around that time.
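As a minimal sketch, a Job manifest that fails fast could look roughly like this (the kafka name and image are placeholders taken from the question, and backoffLimit: 0 is just an illustrative value; pick however many retries you want to allow):

apiVersion: batch/v1
kind: Job
metadata:
  name: kafka
spec:
  backoffLimit: 0                        # Job-level: number of retries before the Job is considered failed
  template:
    spec:
      containers:
        - name: kafka
          image: my-kafka-image:latest   # placeholder for the real image
      restartPolicy: Never               # Pod-level: never restart the container inside an existing pod

Once the limit is exceeded, the Job controller stops creating new pods and marks the Job as failed, which you can confirm with kubectl describe job kafka.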
Upvotes: 21