Reputation: 2303
I am trying to create a bunch of pods, services and deployments using Kubernetes, but I keep hitting the following error when I run the kubectl describe command.
for "POD" with RunContainerError: "runContainer: API error (500): Cannot start container bbdb58770a848733bf7130b1b230d809fcec3062b2b16748c5e4a8b12cc0533a: [8] System error: too many open files in system\n"
I have already terminated all the pods and tried restarting the machine, but it doesn't solve the issue. I am not a Linux expert, so I am just wondering: how should I find all the open files and close them?
Upvotes: 4
Views: 14684
Reputation: 22088
If the problem returns, then you might need to change the ulimit value.
You haven't specified whether you're running on a cloud provider or locally with tools like kind/minikube.
If you need to change the ulimit value on all nodes in a cluster, you can run a privileged DaemonSet that will change it. The container spec would include:
image: busybox
command: ["sh", "-c", "ulimit -n 10000"]
securityContext:
  privileged: true
And then delete it.
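For reference, here is a minimal sketch of what the complete manifest around that container spec might look like, applied via a heredoc; the DaemonSet name and labels are placeholders, not anything from the original answer:

# All names below (set-ulimit, the app label) are hypothetical placeholders.
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: set-ulimit
spec:
  selector:
    matchLabels:
      app: set-ulimit
  template:
    metadata:
      labels:
        app: set-ulimit
    spec:
      containers:
      - name: set-ulimit
        image: busybox
        command: ["sh", "-c", "ulimit -n 10000"]
        securityContext:
          privileged: true
EOF

Since the command exits immediately, the pods will keep restarting until you remove the DaemonSet (kubectl delete daemonset set-ulimit), which matches the "then delete it" step above.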
If it's for a specific node, you can get a shell on it with:
kubectl debug node/mynode -it --image=busybox
And then try to run the ulimit command.
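From that debug shell you can inspect the current limits. These are standard Linux interfaces, so this is a sketch of what to look at rather than part of the original answer:

ulimit -n                    # per-process open-file limit for this shell
cat /proc/sys/fs/file-max    # system-wide limit; "too many open files in system" means this was exhausted
cat /proc/sys/fs/file-nr     # allocated / unused / maximum file handles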
If it's a Linux node and you get a permission error, first try to raise the allowed limit in the /etc/limits.conf file (or /etc/security/limits.conf, depending on your Linux distribution) by adding this line:
* hard nofile 10000
Then log out and log back in, after which you can run:
ulimit -n 10000
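To verify that the new limit actually took effect, you can check the hard and soft values separately (standard shell built-ins):

ulimit -Hn    # hard limit on open files
ulimit -Sn    # soft limit on open files, which is what processes get by default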
If for some reason you can't run the ulimit command, try editing the Docker configuration instead.
If it's just for a quick debug and you need to apply the change to all containers on a node, edit the /etc/docker/daemon.json configuration file:
{
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Hard": 64000,
      "Soft": 64000
    }
  }
}
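Docker only reads daemon.json on startup, so restart the daemon and then verify the new default from a fresh container:

sudo systemctl restart docker
docker run --rm busybox sh -c 'ulimit -n'    # should now print 64000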
For a more permanent change you can, for example in EKS, add this via the user data:
sudo sed -i "s|ExecStart=.*|ExecStart=/usr/bin/dockerd --default-ulimit memlock=83968000:83968000|g" /usr/lib/systemd/system/docker.service
sudo systemctl restart docker.service
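Note that the sed example above raises the memlock limit; for the open-files error in this question you would presumably want nofile instead. A variant under the same assumptions (same unit file path, plus a systemd daemon-reload so the edited unit is picked up):

sudo sed -i "s|ExecStart=.*|ExecStart=/usr/bin/dockerd --default-ulimit nofile=64000:64000|g" /usr/lib/systemd/system/docker.service
sudo systemctl daemon-reload
sudo systemctl restart docker.service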
Upvotes: 1
Reputation: 5642
You can confirm which process is hogging file descriptors by running:
lsof | awk '{print $2}' | sort | uniq -c | sort -n
That will give you a sorted list of open-FD counts with the PID of each process. Then you can look up each process with:
ps -p <pid>
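If you want to do the lookup in one pass, here is a small convenience wrapper around the two commands above (nothing beyond standard lsof/awk/ps):

# Print details for the three PIDs with the most open files;
# NR>1 skips lsof's header line.
for pid in $(lsof | awk 'NR>1 {print $2}' | sort | uniq -c | sort -n | tail -3 | awk '{print $2}'); do
  ps -p "$pid" -o pid,comm,args
done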
If the main hogs are docker/kubernetes, then I would recommend following along on the issue that caesarxuchao referenced.
Upvotes: 13