Reputation: 11377
On one node of my k8s cluster, GC is trying to remove images that are still used by a container.
This behaviour seems strange to me.
Here are the logs:
kubelet: I1218 12:44:19.925831 11177 image_gc_manager.go:334] [imageGCManager]: Removing image "sha256:99e59f495ffaa222bfeb67580213e8c28c1e885f1d245ab2bbe3b1b1ec3bd0b2" to free 746888 bytes
kubelet: E1218 12:44:19.928742 11177 remote_image.go:130] RemoveImage "sha256:99e59f495ffaa222bfeb67580213e8c28c1e885f1d245ab2bbe3b1b1ec3bd0b2" from image service failed: rpc error: code = Unknown desc = Error response from daemon: conflict: unable to delete 99e59f495ffa (cannot be forced) - image is being used by running container 6f236a385a8e
kubelet: E1218 12:44:19.928793 11177 kuberuntime_image.go:126] Remove image "sha256:99e59f495ffaa222bfeb67580213e8c28c1e885f1d245ab2bbe3b1b1ec3bd0b2" failed: rpc error: code = Unknown desc = Error response from daemon: conflict: unable to delete 99e59f495ffa (cannot be forced) - image is being used by running container 6f236a385a8e
kubelet: W1218 12:44:19.928821 11177 eviction_manager.go:435] eviction manager: unexpected error when attempting to reduce nodefs pressure: wanted to free 9223372036854775807 bytes, but freed 0 bytes space with errors in image deletion: rpc error: code = Unknown desc = Error response from daemon: conflict: unable to delete 99e59f495ffa (cannot be forced) - image is being used by running container 6f236a385a8e
Any suggestions? Could a manual removal of Docker images and stopped containers on a node cause such a problem?
Thank you in advance.
Upvotes: 1
Views: 9988
Reputation: 38034
What you've encountered is not the regular Kubernetes garbage collection that deletes orphaned API resource objects, but the kubelet's image garbage collection.
Whenever a node experiences disk pressure, the Kubelet daemon will desperately try to reclaim disk space by deleting (supposedly) unused images. Reading the source code shows that the Kubelet sorts the images to remove by the time since they were last used for creating a Pod -- if all images are in use, the Kubelet will try to delete them anyway and fail (which is probably what happened to you).
You can use the Kubelet's --minimum-image-ttl-duration flag to specify a minimum age that an image needs to have before the Kubelet will ever try to remove it (although this will not prevent the Kubelet from trying to remove used images altogether). Alternatively, see if you can provision your nodes with more disk space for images (or build smaller images).
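As a minimal sketch of how you could tune this (the threshold values below are illustrative assumptions, not recommendations), the relevant kubelet settings are the image GC thresholds, the minimum image age, and the eviction threshold that triggers the cleanup in the first place:

# Illustrative kubelet flags, appended to your existing kubelet arguments.
# Values are examples only -- adjust them to your node's disk size.
#   --image-gc-high-threshold     disk usage (%) at which image GC starts
#   --image-gc-low-threshold      disk usage (%) at which image GC stops
#   --minimum-image-ttl-duration  never remove images younger than this
#   --eviction-hard               nodefs threshold that puts the node under disk pressure
kubelet --image-gc-high-threshold=85 --image-gc-low-threshold=80 \
  --minimum-image-ttl-duration=5m --eviction-hard='nodefs.available<10%'

On newer Kubernetes versions the same settings can also be placed in the KubeletConfiguration file (imageGCHighThresholdPercent, imageGCLowThresholdPercent, imageMinimumGCAge, evictionHard).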
Upvotes: 4
Reputation: 18403
As I understand it, Kubernetes has a garbage collector whose purpose is to remove unnecessary k8s objects in order to free up resources.
If an object does not belong to any owner, it is orphaned. There is a pattern in Kubernetes known as ownership.
For instance, if you apply a Deployment object, it will create a ReplicaSet object, and the ReplicaSet will in turn create Pod objects.
So the ownership flow is:
Deployment <== ReplicaSet <== Pod
Now, if you delete the Deployment object, the ReplicaSet no longer has an owner, so the garbage collector will try to remove the ReplicaSet; the Pods then have no owner either, so the GC will try to remove the Pods as well.
There is a field called ownerReferences which describes the relationship between these objects (Deployment, ReplicaSet, Pods, etc.).
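For illustration (the nginx-deployment name below is a hypothetical example), you can follow that ownership chain with kubectl; each object's ownerReferences points at the object one level above it:

# Hypothetical example: create a Deployment and inspect the resulting ownership chain.
kubectl create deployment nginx-deployment --image=nginx

# The ReplicaSet created for it is owned by the Deployment (prints "Deployment"):
kubectl get replicaset -l app=nginx-deployment \
  -o jsonpath='{.items[0].metadata.ownerReferences[0].kind}{"\n"}'

# The Pod is owned by the ReplicaSet (prints "ReplicaSet"):
kubectl get pod -l app=nginx-deployment \
  -o jsonpath='{.items[0].metadata.ownerReferences[0].kind}{"\n"}'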
There are three ways to delete objects in Kubernetes, depending on what should happen to their dependents: foreground cascading deletion, background cascading deletion, and orphaning the dependents.
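As a quick sketch of those three modes (the deployment name is again hypothetical, and the --cascade values shown are the ones accepted by recent kubectl releases):

# Background cascading deletion (the default): the Deployment is deleted first,
# then the garbage collector removes the ReplicaSet and Pods in the background.
kubectl delete deployment nginx-deployment --cascade=background

# Foreground cascading deletion: the dependents are removed before the owner.
kubectl delete deployment nginx-deployment --cascade=foreground

# Orphan policy: the Deployment is deleted, but the ReplicaSet and Pods are
# left behind without an owner.
kubectl delete deployment nginx-deployment --cascade=orphan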
Solutions to your issues
It seems to me that your Pod (and its containers) is orphaned, so the GC is making sure that it is removed from the cluster.
If you want to check the ownerReferences status:
kubectl get pod $PODNAME -o yaml
In the metadata section, you will find the relevant information.
I have attached references for further research.
Upvotes: 0