Reputation: 20890
I have a pod with some terrible, buggy software in it. One reason Kubernetes is great is that it'll just restart the software when it crashes, which is awesome.
Kubernetes was designed for good software, not terrible software, so it applies an exponential backoff when restarting crashed containers. After enough crashes, this means I have to wait up to five minutes before my pod is restarted.
Is there any way to cap the Kubernetes backoff strategy? I'd like it to never wait longer than thirty seconds before starting the pod up again.
Upvotes: 16
Views: 8281
Reputation: 101
As the reply from @yu-ju hong already established, you can't change the hardcoded back-off values. But if you really want the service inside the pod to restart as often as it likes, and you are not interested in the telemetry that k8s provides around restarts, you could wrap the application in a shell script that restarts it in a while loop.
This is not an answer to the question but could be a pragmatic "solution".
But keep in mind that you then can't distinguish between the crashes you expect and others that are perhaps not expected.
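A minimal sketch of such a wrapper, assuming a POSIX shell is available in the image. The names `restart_loop`, `RESTART_DELAY` and `MAX_RESTARTS` are made up for this example, and the script only starts looping when it is given a command to run:

```shell
#!/bin/sh
# Hypothetical restart wrapper; all names here are illustrative only.

RESTART_DELAY="${RESTART_DELAY:-1}"   # seconds to sleep between restarts
MAX_RESTARTS="${MAX_RESTARTS:-0}"     # 0 means "restart forever"

restart_loop() {
    count=0
    while true; do
        # Run the wrapped command and record its exit status.
        if "$@"; then
            status=0
        else
            status=$?
        fi
        count=$((count + 1))
        echo "restart_loop: '$1' exited with status $status (run $count)" >&2

        # Optional upper bound, mainly useful for trying the script out.
        if [ "$MAX_RESTARTS" -gt 0 ] && [ "$count" -ge "$MAX_RESTARTS" ]; then
            return "$status"
        fi
        sleep "$RESTART_DELAY"
    done
}

# Only start looping when a command was passed on the command line.
if [ "$#" -gt 0 ]; then
    restart_loop "$@"
fi
```

Saved as, say, `wrapper.sh` and used as the container's entrypoint (`wrapper.sh /path/to/app`), the kubelet only ever sees the wrapper process, so the app restarts without any backoff and the pod's restart count no longer reflects the crashes.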
Upvotes: 2
Reputation: 7287
Unfortunately, the max back-off time for container restarts is not tunable, for the sake of node reliability (i.e., too many container restarts can overwhelm the node). If you absolutely want to change it in your cluster, you will need to modify the max back-off time in the code, compile your own kubelet binary, and distribute it onto your nodes.
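For context, the Kubernetes documentation describes the restart delay as 10s, 20s, 40s and so on, capped at five minutes. A rough sketch of that schedule follows; it is not kubelet code, and `backoff_schedule` is a made-up name:

```shell
# Print the first N restart delays in seconds:
# start at 10s, double after each crash, cap at 300s (five minutes).
backoff_schedule() {
    delay=10
    cap=300
    n=1
    while [ "$n" -le "$1" ]; do
        echo "$delay"
        delay=$((delay * 2))
        if [ "$delay" -gt "$cap" ]; then
            delay=$cap
        fi
        n=$((n + 1))
    done
}

backoff_schedule 7
# prints 10, 20, 40, 80, 160, 300, 300 (one value per line)
```

So a thirty-second cap would already be exceeded by the third restart, which is why the five-minute wait shows up so quickly with software that crashes often.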
Upvotes: 13