Reputation: 1873
Sometimes I have a bunch of jobs to launch, and each of them mounts a PVC. Since our resources are limited, some pods fail to mount their volumes in time and I see this event:
Unable to mount volumes for pod "package-job-120348968617328640-5gv7s_vname(b059856a-ecfa-11ea-a226-fa163e205547)": timeout expired waiting for volumes to attach or mount for pod "vname"/"package-job-120348968617328640-5gv7s". list of unmounted volumes=[tmp]. list of unattached volumes=[log tmp].
And it does keep retrying, but it never succeeds (the event age reads something like 44s (x11 over 23m)). However, if I delete this pod, the job creates a new pod and it completes.
So why is this happening? Shouldn't the pod retry the mount automatically instead of needing manual intervention? And if this is unavoidable, is there a workaround to automatically delete pods that have been stuck in the Init phase for more than 2 minutes?
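For the last question, one possible workaround is a small cleanup loop (run manually or from a cron job) that deletes Pending pods older than a threshold so the Job controller recreates them. This is a minimal sketch, assuming jq is available; the <namespace> placeholder and the 120-second threshold are assumptions, and in practice you would add a label selector so it only touches these job pods:

    # Delete pods that have been Pending for more than 120 seconds,
    # so the Job controller replaces them with fresh pods.
    kubectl get pods -n <namespace> -o json \
      | jq -r '.items[]
          | select(.status.phase == "Pending")
          | select((now - (.metadata.creationTimestamp | fromdateiso8601)) > 120)
          | .metadata.name' \
      | xargs -r kubectl delete pod -n <namespace>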
Update: it turned out that the attach script provided by my cloud provider was stuck on some of the nodes (caused by a network problem). So if others run into this problem, checking the storage plugin that attaches the disks is a good idea.
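To spot a stuck attach/mount on a node, the kubelet logs are usually the quickest place to look. A minimal sketch, assuming a systemd-managed kubelet on the node (<node> is a placeholder):

    # On the affected node, check recent kubelet logs for mount/attach errors
    ssh <node> 'journalctl -u kubelet --since "30 min ago" | grep -iE "mount|attach"'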
Upvotes: 0
Views: 1372
Reputation: 7080
I had the same problem, even when the volume was attached to the same node where the pod was running. I SSHed into the node and restarted kubelet, and that fixed the issue.
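For reference, on a node where the kubelet runs as a systemd service (an assumption; the unit name can vary by distro), the restart looks like:

    ssh <node>
    sudo systemctl restart kubelet
    systemctl status kubelet    # confirm it is active again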
Upvotes: 0
Reputation: 128837
So why is this happening? Shouldn't pod retry mount automatically instead of needing manual intervention? And if this is not avoidable, is there a workaround that it will automatically delete pods in Init Phase more than 2 min?
There can be multiple reasons for this. Do you see any Events on the Pod when you run kubectl describe pod <podname>? And do you reuse a PVC that another Pod used before?
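For example (the pod name is a placeholder), the Events section at the bottom of the describe output usually names the failing volume, and you can also list events for the whole namespace:

    kubectl describe pod <podname>
    # or list all events, newest last
    kubectl get events --sort-by=.metadata.creationTimestamp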
I guess that you use a regional cluster, consisting of multiple datacenters (Availability Zones), and that your PVC is located in one AZ while your Pod is scheduled to run in a different AZ? In that situation, the Pod will never be able to mount the volume, since it is located in another AZ.
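You can check this by comparing the zone labels of the PersistentVolume and the node (the names are placeholders; older clusters use the failure-domain.beta.kubernetes.io/zone label instead):

    # Zone of the volume backing the PVC
    kubectl get pv <pv-name> -L topology.kubernetes.io/zone
    # Zone of the node the pod was scheduled on
    kubectl get node <node-name> -L topology.kubernetes.io/zone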
Upvotes: 2