Reputation: 1290
I'm evaluating Karpenter (https://karpenter.sh/) and I wanted to know if there's a way to vertically scale down a large node with few pods. The only scaling actions seem to be triggered by either unschedulable pods or empty nodes.
Scenario: I scheduled 5 pods and the scheduler gave me one c5d.2xlarge instance, which resulted in 65% utilization (not bad). I killed 3 pods and utilization dropped, as expected, to 25%. I waited over 20 hours to see whether an optimization process would kick in, but nothing happened. The feature is not well documented; in fact, the only reference to it is in this independent article: https://blog.sivamuthukumar.com/karpenter-scaling-nodes-seamlessly-in-aws-eks
How does it work?
- Observes the pod resource requests of unscheduled pods
- Direct provision of Just-in-time capacity of the node. (Groupless Node Autoscaling)
- Terminating nodes if outdated
- Reallocating the pods in nodes for better resource utilization
Am I missing something? Is there a way to do this, using Karpenter or another solution? TIA
Upvotes: 0
Views: 1211
Reputation: 3479
As of Karpenter 1.0, consolidation can replace a node with a cheaper one if the replacement has enough resources for the pods running on it. See https://karpenter.sh/docs/concepts/disruption/#consolidation for more details.
However, note that if you want this behavior for Spot instances, you need to enable the SpotToSpotConsolidation feature flag.
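To make the "cheaper replacement with enough resources" idea concrete, here is a minimal Python sketch of that decision. It is not Karpenter's actual algorithm or API: the names `InstanceType`, `Pod`, `CATALOG`, and `cheaper_replacement` are made up for illustration, the cost column is in relative units rather than real prices, and it ignores system-reserved capacity, scheduling constraints, and multi-node consolidation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpu: float
    memory_gib: float
    relative_cost: float  # illustrative units only, NOT real AWS pricing


@dataclass(frozen=True)
class Pod:
    cpu: float         # requested vCPU
    memory_gib: float  # requested memory in GiB


# Hypothetical catalog; sizes follow the c5d family, costs are made up.
CATALOG = [
    InstanceType("c5d.large", 2, 4, 1.0),
    InstanceType("c5d.xlarge", 4, 8, 2.0),
    InstanceType("c5d.2xlarge", 8, 16, 4.0),
]


def cheaper_replacement(
    current: InstanceType,
    pods: Sequence[Pod],
    candidates: Sequence[InstanceType],
) -> Optional[InstanceType]:
    """Return the cheapest candidate that costs less than `current`
    and still fits the summed pod requests, or None if there is none."""
    need_cpu = sum(p.cpu for p in pods)
    need_mem = sum(p.memory_gib for p in pods)
    fitting = [
        c for c in candidates
        if c.relative_cost < current.relative_cost
        and c.vcpu >= need_cpu
        and c.memory_gib >= need_mem
    ]
    return min(fitting, key=lambda c: c.relative_cost, default=None)


current_node = CATALOG[2]  # the c5d.2xlarge from the question
remaining_pods = [Pod(cpu=1.5, memory_gib=3.0)] * 2  # assumed pod sizes

replacement = cheaper_replacement(current_node, remaining_pods, CATALOG)
print(replacement.name if replacement else "keep current node")
```

With the assumed pod sizes, the two remaining pods need 3 vCPU and 6 GiB in total, so the sketch picks the c5d.xlarge. With all five pods (7.5 vCPU, 15 GiB) nothing cheaper fits and the node is kept.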
Upvotes: 0
Reputation: 1290
So there's a feature request on Karpenter's GitHub project addressing this specific issue: https://github.com/aws/karpenter/issues/1091. I'll update this answer once a solution is available.
The workaround suggested by the project team was to set a short TTL on the nodes (e.g. 1 day), forcing Karpenter to re-evaluate node sizing daily.
Upvotes: 1