dusty

Reputation: 573

Kubernetes: achieving uneven/weighted pod distribution in EKS

We plan to use AWS EKS to run a stateless application.

There is a goal to achieve optimal budget by using spot instances and prefer them to on-demand ones.

Per AWS recommendations, we plan to have two Managed Node Groups: one with on-demand instances, and one with spot instances, plus Cluster Autoscaler to adjust groups size.

Now, the problem to solve is achieving two somewhat conflicting requirements: prefer spot instances wherever possible to minimize cost, while still keeping some pods running on on-demand instances for availability.

After some research I found the following possible approaches to solving it:

Approach A: Using preferredDuringSchedulingIgnoredDuringExecution with weights based on the Node Group capacity type label. E.g. one preferredDuringSchedulingIgnoredDuringExecution rule with weight 90 would prefer nodes with capacity type spot, and another rule with weight 1 would prefer on-demand ones, e.g.:

affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 90
        preference:
          matchExpressions:
            - key: eks.amazonaws.com/capacityType
              operator: In
              values:
                - SPOT
      - weight: 1
        preference:
          matchExpressions:
            - key: eks.amazonaws.com/capacityType
              operator: NotIn
              values:
                - SPOT

The downside is that — as I understand it — you are not guaranteed to have any pods running on the less preferred group, since weights only influence scoring, not an exact distribution.

Another approach, which in theory could be combined with the one above, is using topologySpreadConstraints, e.g.:

spec:
  topologySpreadConstraints:
  - maxSkew: 20
    topologyKey: eks.amazonaws.com/capacityType
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        foo: bar

This would distribute pods across nodes with different capacity types, while allowing a skew of, say, 20 pods between them, and presumably should be combined with preferredDuringSchedulingIgnoredDuringExecution to achieve the desired effect.
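For reference, here is how the two mechanisms might be combined in a single Deployment spec. This is only a sketch based on the snippets above; the label foo: bar and the weight/maxSkew values are placeholders from the question, not recommendations, and note that EKS sets the capacityType label values in uppercase (SPOT / ON_DEMAND):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 30
  selector:
    matchLabels:
      foo: bar
  template:
    metadata:
      labels:
        foo: bar
    spec:
      # Soft preference: score spot nodes much higher than on-demand ones.
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 90
              preference:
                matchExpressions:
                  - key: eks.amazonaws.com/capacityType
                    operator: In
                    values:
                      - SPOT
            - weight: 1
              preference:
                matchExpressions:
                  - key: eks.amazonaws.com/capacityType
                    operator: NotIn
                    values:
                      - SPOT
      # Soft cap on how unbalanced the two capacity types may become.
      topologySpreadConstraints:
        - maxSkew: 20
          topologyKey: eks.amazonaws.com/capacityType
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              foo: bar
      containers:
        - name: app
          image: my-app:latest
```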

How feasible is the approach above? Are those the right tools to achieve the goals? I would very much appreciate any advice on the case!

Upvotes: 1

Views: 795

Answers (1)

coderanger

Reputation: 54181

This is not something the Kubernetes scheduler supports. Weights in affinities are more like score multipliers, and maxSkew is a very general cap on how out of balance things can get, not on the direction of that imbalance.

You would have to write something custom AFAIK, or at least I've not seen anything for this when I went looking last. Check out the scheduler extender webhook system for a somewhat easy way to implement it.
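To illustrate what such a custom component might do, here is a rough Python sketch of the "prioritize" logic a scheduler extender could implement. All names and thresholds here are hypothetical — this is not a real extender API, just the scoring idea: prefer spot nodes, except while the on-demand group still holds fewer pods than some desired minimum.

```python
# Sketch of prioritize logic for a hypothetical scheduler extender.
# Names, scores, and thresholds are illustrative assumptions.

CAPACITY_LABEL = "eks.amazonaws.com/capacityType"  # set by EKS on managed nodes

def prioritize(nodes, pods_on_demand, min_on_demand=2):
    """Score candidate nodes from 0 to 10.

    nodes: mapping of node name -> node labels.
    pods_on_demand: current count of this app's pods on on-demand nodes.
    min_on_demand: desired minimum pod count on on-demand capacity.
    """
    need_on_demand = pods_on_demand < min_on_demand
    scores = {}
    for name, labels in nodes.items():
        is_spot = labels.get(CAPACITY_LABEL) == "SPOT"
        if need_on_demand:
            # Steer pods to on-demand until the minimum is met.
            scores[name] = 0 if is_spot else 10
        else:
            # Otherwise strongly prefer spot for cost savings.
            scores[name] = 10 if is_spot else 1
    return scores

nodes = {
    "node-spot": {CAPACITY_LABEL: "SPOT"},
    "node-od": {CAPACITY_LABEL: "ON_DEMAND"},
}
# On-demand group is still below its minimum, so node-od wins.
print(prioritize(nodes, pods_on_demand=0))
```

In a real extender this function would run behind the HTTP "prioritize" endpoint configured in the scheduler policy, with pod counts read from the API server.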

Upvotes: 3
