Reputation: 121
I have a master node that has disk pressure and is spamming the log with endless messages like these:
Mar 18 22:53:04 kubelet[7521]: W0318 22:53:04.413211 7521 eviction_manager.go:344] eviction manager: attempting to reclaim ephemeral-storage
Mar 18 22:53:04 kubelet[7521]: I0318 22:53:04.413235 7521 container_gc.go:85] attempting to delete unused containers
......................
Mar 18 22:53:04 kubelet[7521]: E0318 22:53:04.429446 7521 eviction_manager.go:574] eviction manager: cannot evict a critical pod kube-controller-manager_kube-system(5308d5632ec7d3e588c56d9f0bca17c8)
Mar 18 22:53:04 kubelet[7521]: E0318 22:53:04.429458 7521 eviction_manager.go:574] eviction manager: cannot evict a critical pod kube-apiserver_kube-system(9fdc5b37e61264bdf7e38864e765849a)
Mar 18 22:53:04 kubelet[7521]: E0318 22:53:04.429464 7521 eviction_manager.go:574] eviction manager: cannot evict a critical pod kube-scheduler_kube-system(90280dfce8bf44f46a3e41b6c4a9f551)
Mar 18 22:53:04 kubelet[7521]: E0318 22:53:04.429472 7521 eviction_manager.go:574] eviction manager: cannot evict a critical pod coredns-74ff55c5b-th722_kube-system(33744a13-8f71-4e36-8cfb-5955c5348a14)
Mar 18 22:53:04 kubelet[7521]: E0318 22:53:04.429478 7521 eviction_manager.go:574] eviction manager: cannot evict a critical pod coredns-74ff55c5b-d45hd_kube-system(65a5684e-5013-4683-aa38-820114260d63)
Mar 18 22:53:04 kubelet[7521]: E0318 22:53:04.429487 7521 eviction_manager.go:574] eviction manager: cannot evict a critical pod weave-net-wjs78_kube-system(f0f9a4e5-98a4-4df4-ac28-6bc1202ec06d)
Mar 18 22:53:04 kubelet[7521]: E0318 22:53:04.429493 7521 eviction_manager.go:574] eviction manager: cannot evict a critical pod kube-proxy-8dvws_kube-system(c55198f4-38bc-4adf-8bd8-4a2ec2d8a46d)
Mar 18 22:53:04 kubelet[7521]: E0318 22:53:04.429498 7521 eviction_manager.go:574] eviction manager: cannot evict a critical pod etcd_kube-system(e3f86cf1b5559dfe46a5167a548f8a4d)
Mar 18 22:53:04 kubelet[7521]: I0318 22:53:04.429502 7521 eviction_manager.go:396] eviction manager: unable to evict any pods from the node
..............
This has been going on for months. I know the disk pressure threshold is probably set to its default value, but WHERE is it configured in the first place?
I do know about this: https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/
It is probably this setting that can be set:
imagefs.available := node.stats.runtime.imagefs.available
(according to the link above)
But again, where? In etcd? How can I set this as a default for all nodes?
It is true that there is less space available than the threshold, but this is the control plane (there are no other pods on it) and not a production system; it is for testing only, and I can't see anything in the logs because Kubernetes spams them full of garbage. Garbage because these messages make absolutely no sense: these pods are never supposed to be evicted, they are essential, and eviction should not even be attempted for them.
My questions:
Upvotes: 2
Views: 7166
Reputation: 43
The kubelet is, in effect, the Kubernetes kernel and runs on each node. Similar to a Linux kernel, it manages critical node functions like memory and disk allocation, so disk pressure is configured there.
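For reference, you can see which thresholds a node is actually running with by querying the kubelet's configz debug endpoint through the API server proxy (a sketch; the node name below is a placeholder, and configz is a debug-only interface):

# Hypothetical node name; take a real one from `kubectl get nodes`
NODE=mymaster
# Dumps the kubelet's live configuration as JSON, including the evictionHard map
kubectl get --raw "/api/v1/nodes/${NODE}/proxy/configz"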
Aside from the obvious answer of deleting some files, you can control the disk pressure thresholds in the following ways:
Quick and Dirty
This is the old way, via command-line flags:
kubelet --eviction-hard 'imagefs.available<5%,memory.available<10Mi,nodefs.available<3%'
Which returns a deprecation warning:
Flag --eviction-hard has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag
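As an aside, on kubeadm-based installs these flags are usually not edited in the unit file itself but appended via the KUBELET_EXTRA_ARGS environment variable read by the kubelet systemd drop-in (a sketch under that assumption; the path differs by distro, and it does not apply to custom installs like the one shown below):

# /etc/default/kubelet on Debian/Ubuntu, /etc/sysconfig/kubelet on RHEL/CentOS
KUBELET_EXTRA_ARGS=--eviction-hard=imagefs.available<5%,memory.available<10Mi,nodefs.available<3%
# Restart so the kubelet picks up the new flags
sudo systemctl restart kubelet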
The right way
On most Linux systems we can check which config file is passed to the kubelet at startup with:
sudo systemctl status kubelet
Find the config file from the '--config' flag on the startup command shown under 'CGroup':
● kubelet.service - Kubernetes Kubelet Server
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2024-08-12 19:33:13 CDT; 36min ago
Docs: https://github.com/GoogleCloudPlatform/kubernetes
Main PID: 44420 (kubelet)
Tasks: 20
Memory: 54.8M
CGroup: /system.slice/kubelet.service
└─44420 /usr/local/bin/kubelet --v=2 --node-ip=0.0.0.0 --hostname-override=myhostname.com --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --config=/etc/kubernetes/kubelet-config.yaml --kubeconfig=/etc/kubernetes/kubelet.conf --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock --runtime-cgroups=/system.slice/containerd.service --root-dir=/opt/services/k8s/kubelet
Append the required changes to the end of the file with vim:
vi /etc/kubernetes/kubelet-config.yaml
...
...
eventRecordQPS: 5
shutdownGracePeriod: 60s
shutdownGracePeriodCriticalPods: 20s
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "3%"
  nodefs.inodesFree: "5%"
  imagefs.available: "5%"
Restart the kubelet service to apply the changes on this node only:
sudo systemctl restart kubelet
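To get the same defaults onto every node, one option, assuming the cluster was created with kubeadm (the question does not say, so this is an assumption), is to change the kubelet configuration stored in the kube-system ConfigMap that kubeadm uses for joins and upgrades, then refresh each node from it:

# Edit the cluster-wide kubelet defaults kept by kubeadm
# (ConfigMap is named kubelet-config, or kubelet-config-<version> on older releases)
kubectl edit configmap -n kube-system kubelet-config

# On each existing node: write the updated config to the node's config file and restart
sudo kubeadm upgrade node phase kubelet-config
sudo systemctl restart kubelet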
Your other questions are vague.
Upvotes: 1
Reputation: 54211
There are three ways to set kubelet options. The first is command-line flags like --eviction-hard. The next is a config file. The most recent is dynamic configuration.
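For the config-file route, here is a minimal sketch of what the file passed via --config can look like (field names from the kubelet.config.k8s.io/v1beta1 KubeletConfiguration API; the threshold values are placeholders, not recommendations):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Hard eviction thresholds; the kubelet starts evicting once any of these is crossed
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "3%"
  nodefs.inodesFree: "5%"
  imagefs.available: "5%"

The kubelet is then started with something like kubelet --config=/var/lib/kubelet/config.yaml (the path is install-specific).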
Of course the better answer here is to free up some disk space.
Upvotes: 2