Reputation: 161
The requests our app needs to serve vary widely in processing latency (from a few seconds to hours), and the latency of a given request is not known in advance.
We'd like to use Kubernetes autoscaling, but it is not clear how to deal with the random pod termination policy during downscaling, which is at odds with our desire not to terminate long-running requests that are still being processed.
Has anybody else run into a similar situation? What solutions did you come up with?
Upvotes: 1
Views: 1089
Reputation: 22874
One thing you can do is build termination handling into your app and set a fairly long termination grace period. There is a nice explanation of this topic at https://pracucci.com/graceful-shutdown-of-kubernetes-pods.html
This does not completely prevent long-lived connections from being killed. To be honest, nothing will. It does, however, significantly limit the impact of events such as scaling on this type of workload.
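To make the termination handling part more concrete, here is a minimal sketch in Python of what it could look like. It assumes your app pulls jobs in a loop; `GracefulWorker`, `request_stop` and `run` are names made up for illustration, not part of any library. The idea is only that SIGTERM sets a flag, the request in flight is allowed to finish, and no new work is picked up afterwards.

```python
import signal
import threading


class GracefulWorker:
    """Hypothetical worker that finishes its current job after SIGTERM."""

    def __init__(self):
        self._stop = threading.Event()
        self.completed = []

    def install_signal_handlers(self):
        # signal.signal() may only be called from the main thread.
        signal.signal(signal.SIGTERM, self.request_stop)

    def request_stop(self, signum=None, frame=None):
        # Only record the request; the job in flight is not interrupted.
        self._stop.set()

    def run(self, jobs):
        for job in jobs:
            if self._stop.is_set():
                break  # shutdown requested: stop taking new work
            self.completed.append(job())
        return self.completed
```

Kubernetes sends SIGTERM to the container, waits up to `terminationGracePeriodSeconds` from the pod spec (30 seconds by default), and then sends SIGKILL. So the handler above only helps if that value is raised to roughly the longest request you want to survive.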
Upvotes: 3