Dasarathi Swain

Reputation: 991

The node was low on resource: ephemeral-storage

All the pods on a node are in the Evicted state due to "The node was low on resource: ephemeral-storage."

portal-59978bff4d-2qkgf                            0/1     Evicted   0          14m
release-mgmt-74995bc7dd-nzlgq                      0/1     Evicted   0          8m20s
service-orchestration-79f8dc7dc-kx6g4              0/1     Evicted   0          7m31s
test-mgmt-7f977567d6-zl7cc                         0/1     Evicted   0          8m17s

Does anyone know a quick fix for this?

Upvotes: 65

Views: 163602

Answers (7)

Vinh Trieu

Reputation: 1083

This issue happens due to a lack of temporary storage while the application is running, for example when processes write temporary or cache data.

To resolve it, exec into your pod while the process is running, check which mount point is eating your available storage with the command df -h, and note the remaining capacity. You can then create a PVC (backed by hostPath or another storage class) with a larger size and mount it at the directory where the pod stores its temporary data.
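
For illustration, a minimal sketch of that approach; the PVC name, size, image and mount path below are placeholders, not from the original answer:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tmp-data-pvc                # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi                 # pick a size larger than the temporary data needs
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: images.my-company.example/app:v4
    volumeMounts:
    - name: tmp-data
      mountPath: /var/tmp/app       # the directory the process writes its temporary data to
  volumes:
  - name: tmp-data
    persistentVolumeClaim:
      claimName: tmp-data-pvc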

Upvotes: 5

Andre Miras

Reputation: 3840

In my case the problem was that the nodes were filling up with Docker images, some of them unused and never pruned, others way too big. To confirm it, first SSH to the node and check whether the disk is (nearly) full. For instance:

[root@node-name ~]# df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p1   20G   15G  5.9G  71% /

It's possible to find out which image specifically occupies the most space, and I recommend doing so. Check this excellent resource to see how: https://rharshad.com/eks-troubleshooting-disk-pressure/

Knowing which image takes the most space and investigating its file system to know why can be useful to optimize image size, but that's a different topic.
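
As a quick first pass on the node itself (a standard Docker command, not something specific to the linked article), docker system df shows how much space images, containers, local volumes and the build cache take up:

[root@node-name ~]# docker system df        # summary of Docker disk usage
[root@node-name ~]# docker system df -v     # verbose, per-image and per-container breakdown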

If you can't add more storage to the node, it's possible to clean it up with docker prune. But before that we need to make sure no containers are running, so let's drain the node first:

kubectl drain node-name

Note that the node will be cordoned after it's drained, which means no new containers will be scheduled onto it. Back inside the node, let's prune the unused Docker resources:

[root@node-name ~]# docker system prune --all
WARNING! This will remove:
  - all stopped containers
  - all networks not used by at least one container
  - all images without at least one container associated to them
  - all build cache

Are you sure you want to continue? [y/N] y
Deleted Containers:
8333683571a2ceff47bf08cc254f8fa3809acacc7fb981be3c1c274e9465dd68
28bdc62425707127ac977d20fd3dc85374ffc54ccccf2b2f2098d9af9ca3c898
7315014bfd9207c5a1b8e76ef0f1567bb5e221de6fe0304f4728218abd7e1f3f
b0f5ecb854a9f4b41610d7ec5b556447600f57529e68ae2093d1d40df02ff214
9e24227321d5e151bc665c55bcd474c9d586857cbac3cad744aad2dc11729e5e
63ab1bf7ded78d4b77db22f9c1aaac6a55247c71ca55b51caa8492f2b16c4d69
...
Total reclaimed space: 4.529GB

Then check the storage space again:

[root@node-name ~]# df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p1   20G  8.9G   12G  45% /

Now let’s put the node back to a ready state using the kubectl command from the host:

rancher kubectl uncordon node-name

Upvotes: 7

matyas

Reputation: 2796

My problem was that my pod was writing to a folder that was not defined in the volumeMounts of the deployment.

volumeMounts:
  - name: my-data-volume
    mountPath: "/the/path/thatImounted"

My pod wrote to a different path than "/the/path/thatImounted".

The solution in this case is to either add the path that the pod writes to as an additional mountPath, or to fix the wrong mountPath.
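
A minimal sketch of the first option, assuming the pod actually writes to "/the/path/itWritesTo" (a hypothetical path) and that a size-capped emptyDir is acceptable for that data:

volumeMounts:
  - name: my-data-volume
    mountPath: "/the/path/thatImounted"
  - name: scratch-volume                 # hypothetical extra volume for the real write path
    mountPath: "/the/path/itWritesTo"
volumes:
  - name: scratch-volume
    emptyDir:
      sizeLimit: 1Gi                     # cap it so it cannot exhaust the node's ephemeral storage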

Upvotes: 1

quoc9x

Reputation: 2191

If you don't set limits.ephemeral-storage and requests.ephemeral-storage, by default pods are allowed to use all of the node's storage space.
So you can set limits.ephemeral-storage and requests.ephemeral-storage:

apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: app
    image: images.my-company.example/app:v4
    resources:
      requests:
        ephemeral-storage: "2Gi"
      limits:
        ephemeral-storage: "4Gi"

Or configure the Docker logging driver to limit the amount of stored logs (in the file /etc/docker/daemon.json; by default this file doesn't exist, you must create it):

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "2"
  }
}

Upvotes: 3

kalyani chaudhari

Reputation: 7849

Please consider the following factors:

  1. The application you are deploying via Kubernetes should have limits and requests set for memory and CPU in its manifest file (see the sketch after this list).
  2. Your nodes should be sized in the Kubernetes cluster according to your application requirements.
  3. Increase the number of nodes if all of them are heavily used by apps.
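
As a reference for point 1, a minimal container resources block; the values are placeholders to tune for your application:

resources:
  requests:
    cpu: "250m"        # placeholder CPU request
    memory: 256Mi      # placeholder memory request
  limits:
    cpu: "500m"
    memory: 512Mi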

Upvotes: -5

Hemal Ekanayake

Reputation: 21

You can increase the size of the EBS volume that is attached and restart the EC2 instance for the change to take effect.
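
For reference, a rough sketch of that with the AWS CLI; the volume ID, size, device name and filesystem are placeholders, and the exact grow commands depend on your partition layout:

# Resize the attached EBS volume (hypothetical volume ID and target size)
aws ec2 modify-volume --volume-id vol-0123456789abcdef0 --size 50

# On the instance, grow the partition and the filesystem into the new space
sudo growpart /dev/nvme0n1 1       # extend partition 1
sudo resize2fs /dev/nvme0n1p1      # for ext4; use xfs_growfs for XFS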

Upvotes: -3

Arghya Sadhu

Reputation: 44677

Pods that use emptyDir volumes without storage quotas will fill up this storage, and the following message shows up in the kubelet logs:

eviction manager: attempting to reclaim ephemeral-storage

Set limits.ephemeral-storage and requests.ephemeral-storage to limit this; otherwise any container can write any amount of data to its node's filesystem.

A sample ResourceQuota definition:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
spec:
  hard:
    pods: "4" 
    requests.cpu: "1" 
    requests.memory: 1Gi 
    requests.ephemeral-storage: 2Gi 
    limits.cpu: "2" 
    limits.memory: 2Gi 
    limits.ephemeral-storage: 4Gi
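
Related to the emptyDir point above, you can also cap an individual emptyDir volume with sizeLimit; a minimal sketch (the volume name is a placeholder):

volumes:
  - name: cache-volume
    emptyDir:
      sizeLimit: 1Gi    # the pod is evicted if it writes more than this to the volume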

Another reason for this issue can be log files eating disk space. Check this question

Upvotes: 43
