Reputation: 103
We would like to pack as many pods as possible onto each node in our cluster, to reduce the number of nodes we run in some of our environments. I came across the HighNodeUtilization strategy of the descheduler (https://github.com/kubernetes-sigs/descheduler), which seems to fit the bill. However, it appears that the cluster's scheduler must use the MostAllocated scoring strategy for this to work.
I believe the kube-scheduler in EKS is not accessible for configuration. How, then, do I configure the MostAllocated scoring strategy?
Better yet, how do I set up this automated packing of pods onto as few nodes as possible without using the descheduler?
I tried deploying the descheduler as is, without the MostAllocated scoring strategy configured. Unsurprisingly, it did not produce the expected results.
Much of my digging online pointed to creating a custom scheduler, but the few resources I found on how to do so were unclear.
Upvotes: 3
Views: 1539
Reputation: 178
EKS does not let you override the default scheduler configuration, so configuring the default-scheduler profile with the MostAllocated scoring strategy is not an option. However, you can run your own scheduler alongside the default one and configure it however you like. Once you have created a custom scheduler, you can set the MostAllocated scoring strategy in that scheduler's configuration and then instruct your workloads to use that scheduler.
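As a minimal sketch of that last step, a workload opts into a non-default scheduler through the pod spec's `schedulerName` field. The pod name and image here are placeholders chosen for illustration, and the scheduler name assumes the custom scheduler's profile is called custom-scheduler:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: bin-packed-pod              # hypothetical name, for illustration only
spec:
  # Must match the schedulerName of a profile in the custom scheduler's config.
  # If omitted, the pod is handled by default-scheduler.
  schedulerName: custom-scheduler
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9   # placeholder image
    resources:
      requests:                        # MostAllocated scores nodes by requested resources
        cpu: 100m
        memory: 64Mi
```

For Deployments and other controllers, the same field goes under `spec.template.spec`. Pods that name a scheduler that is not running stay Pending, so it is worth checking that the custom scheduler is up before switching workloads over.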
To run multiple schedulers, you have to set up several Kubernetes objects. These objects are documented in the guide linked above.
The deployment will use the standard upstream kube-scheduler image (registry.k8s.io/kube-scheduler), unless you'd like to build your own, which I wouldn't recommend.
In addition, ensure that your version of the kube-scheduler is compatible with the version of the configuration objects that you use to configure the scheduler profile. v1beta2 is safe for v1.22.x through v1.24.x, but only v1beta3 or v1 is safe for v1.25+.
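For a v1.25+ cluster, the scheduler configuration might look roughly like the sketch below, which uses the v1 configuration API. It follows the shape of the upstream bin-packing example; the profile name custom-scheduler and the equal cpu/memory weights are assumptions carried over from this answer, not requirements:

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: false                 # single replica, so no leader election
profiles:
- schedulerName: custom-scheduler    # assumed name; pods reference it via spec.schedulerName
  pluginConfig:
  - name: NodeResourcesFit
    args:
      scoringStrategy:
        type: MostAllocated          # prefer nodes that are already heavily requested
        resources:
        - name: cpu
          weight: 1
        - name: memory
          weight: 1
```

The kube-scheduler image tag should then match the cluster's minor version as well.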
For example, here's a working version of a deployment manifest and config map that are used to create a custom scheduler compatible with k8s v1.22.x. Note you'll still have to create the other objects for this to work:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: custom-scheduler
      namespace: kube-system
    spec:
      replicas: 1
      selector:
        matchLabels:
          name: custom-scheduler
      template:
        metadata:
          labels:
            component: scheduler
            name: custom-scheduler
            tier: control-plane
        spec:
          containers:
          - command:
            - /usr/local/bin/kube-scheduler
            # Points at the file mounted from the ConfigMap
            - --config=/etc/kubernetes/custom-scheduler/custom-scheduler-config.yaml
            env: []
            image: registry.k8s.io/kube-scheduler:v1.22.16
            imagePullPolicy: IfNotPresent
            livenessProbe:
              httpGet:
                path: /healthz
                port: 10259
                scheme: HTTPS
            name: custom-scheduler
            readinessProbe:
              httpGet:
                path: /healthz
                port: 10259
                scheme: HTTPS
            volumeMounts:
            - mountPath: /etc/kubernetes/custom-scheduler
              name: custom-scheduler-config
          serviceAccountName: custom-scheduler
          volumes:
          - configMap:
              name: custom-scheduler-config
            name: custom-scheduler-config
    ---
    apiVersion: v1
    kind: ConfigMap
    data:
      custom-scheduler-config.yaml: |
        apiVersion: kubescheduler.config.k8s.io/v1beta2
        kind: KubeSchedulerConfiguration
        leaderElection:
          leaderElect: false
        profiles:
        - pluginConfig:
          - args:
              apiVersion: kubescheduler.config.k8s.io/v1beta2
              kind: NodeResourcesFitArgs
              scoringStrategy:
                resources:
                - name: cpu
                  weight: 1
                - name: memory
                  weight: 1
                type: MostAllocated
            name: NodeResourcesFit
          plugins:
            score:
              enabled:
              - name: NodeResourcesFit
                weight: 1
          schedulerName: custom-scheduler
    metadata:
      name: custom-scheduler-config
      namespace: kube-system
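The remaining objects are mostly the service account and its RBAC bindings. A minimal sketch, modeled on the upstream multiple-schedulers example, could look like this. The binding names are hypothetical; the roles they refer to (system:kube-scheduler, system:volume-scheduler, and extension-apiserver-authentication-reader) are built-in roles that are assumed to exist in the cluster, as they do on a standard installation:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: custom-scheduler             # referenced by serviceAccountName in the Deployment
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-scheduler-as-kube-scheduler       # hypothetical name
subjects:
- kind: ServiceAccount
  name: custom-scheduler
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:kube-scheduler        # built-in role with the scheduler's permissions
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-scheduler-as-volume-scheduler     # hypothetical name
subjects:
- kind: ServiceAccount
  name: custom-scheduler
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:volume-scheduler      # built-in role for volume-aware scheduling
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: custom-scheduler-extension-apiserver-authentication-reader   # hypothetical name
  namespace: kube-system
subjects:
- kind: ServiceAccount
  name: custom-scheduler
  namespace: kube-system
roleRef:
  kind: Role
  name: extension-apiserver-authentication-reader
  apiGroup: rbac.authorization.k8s.io
```

Because leader election is disabled in the scheduler configuration, this sketch does not grant any extra permissions for a custom lease. If you run more than one replica with leader election enabled, additional RBAC for the lease object would likely be needed.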
Upvotes: 1