LoganHenderson

Reputation: 1442

Volume is already exclusively attached to one node and can't be attached to another

I have a pretty simple Kubernetes setup. I want a StatefulSet with the following process:

  1. An initContainer downloads and uncompresses a tarball from S3 into a volume mounted to the initContainer.
  2. That volume is mounted to my main container to be used.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app
  namespace: test
  labels:
    name: app
spec:
  serviceName: app
  replicas: 1
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      initContainers:
      - name: preparing
        image: alpine:3.8
        imagePullPolicy: IfNotPresent
        command:
          - "sh"
          - "-c"
          - |
            echo "Downloading data"
            wget https://s3.amazonaws.com/.........
            tar -xvzf xxxx-........ -C /root/
        volumeMounts:
        - name: node-volume
          mountPath: /root/data/

      containers:
      - name: main-container
        image: ecr.us-west-2.amazonaws.com/image/:latest
        imagePullPolicy: Always

        volumeMounts:
        - name: node-volume
          mountPath: /root/data/

  volumeClaimTemplates:
  - metadata:
      name: node-volume
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: gp2-b
      resources:
        requests:
          storage: 80Gi

When I run this, I can at first see the logs flowing as my tarball is downloaded by the initContainer. About halfway through, it terminates and gives me the following error:

Multi-Attach error for volume "pvc-faedc8" Volume is 
already exclusively attached to one node and can't be 
attached to another

Upvotes: 22

Views: 69199

Answers (6)

Abdul Jabbar

Reputation: 412

This happens because the PVC is attached to a dangling node which might not be available anymore.

You do not need to delete all PVCs or VolumeAttachments.

The following solution worked for me without deleting the PVC!

First, get the VolumeAttachment that is causing the problem:

kubectl get VolumeAttachment | grep NAME_OF_PVC_IN_ERROR

You'll get the VolumeAttachment that is attaching your PV to a specific node.
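
If grep on the PVC name doesn't narrow it down, a jsonpath query can list each attachment together with its PV and node. A minimal sketch, assuming PV_NAME is the PersistentVolume name from the error (e.g. pvc-faedc8):

kubectl get volumeattachment -o jsonpath='{range .items[*]}{.metadata.name}{" -> "}{.spec.source.persistentVolumeName}{" on "}{.spec.nodeName}{"\n"}{end}' | grep PV_NAME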

Then delete this VolumeAttachment:

kubectl delete VolumeAttachment NAME_OF_VOLUME_ATTACHMENT

Then delete the pod forcefully, and the next pod creation will work fine:

kubectl delete pod --grace-period=0 --force POD_NAME

Upvotes: 5

Ali

Reputation: 1449

Here is the ONLY solution that worked for me:

  1. Scale your app to 0:

kubectl -n NAMESPACE scale statefulset/NAME --replicas 0

or

kubectl -n NAMESPACE edit statefulset/NAME

and manually set replicas: 0. (Use deployment/NAME if your app is a Deployment.)

  2. Delete all VolumeAttachments:

kubectl -n NAMESPACE delete volumeattachment --all

  3. Bring back the replicas. (See the consolidated sketch after this list.)
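
Putting it together, a minimal sketch of the whole cycle, assuming the StatefulSet from the question (name app, namespace test, pod label app=app; substitute your own names). Note that VolumeAttachment is cluster-scoped, so no namespace flag is needed for it, and --all removes every attachment in the cluster, which may disrupt other workloads on a shared cluster:

kubectl -n test scale statefulset/app --replicas 0
kubectl -n test wait --for=delete pod -l app=app --timeout=120s
kubectl delete volumeattachment --all
kubectl -n test scale statefulset/app --replicas 1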

That's it 🍻

Upvotes: 17

Rotem jackoby

Reputation: 22088

I'll add an answer that will prevent this from happening again.

Short answer

Access modes: Switch from ReadWriteOnce to ReadWriteMany.


In a bit more detail

You're using a StatefulSet where each replica has its own state, with a unique persistent volume claim (PVC) created for each pod. Each PVC refers to a Persistent Volume for which you decided that the access mode is ReadWriteOnce.

Which, as you can see from the Kubernetes documentation:

ReadWriteOnce
the volume can be mounted as read-write by a single node. ReadWriteOnce access mode still can allow multiple pods to access the volume when the pods are running on the same node.

So if the K8s scheduler decides to shift the pod to a different node (due to priorities, resource calculations, or a Cluster Autoscaler), you will receive an error that the volume is already exclusively attached to one node and can't be attached to another.

Please consider using ReadWriteMany where the volume can be mounted as read-write by many nodes.
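
As a sketch, here is the claim template from the question with the access mode switched. Note this only works if the storage driver supports ReadWriteMany; EBS-backed classes such as gp2 do not, so the storageClassName below (efs-sc) is a hypothetical shared-filesystem class such as EFS:

volumeClaimTemplates:
- metadata:
    name: node-volume
  spec:
    accessModes: [ "ReadWriteMany" ]
    storageClassName: efs-sc
    resources:
      requests:
        storage: 80Gi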

Upvotes: -2

miu

Reputation: 1304

I had the same issue just now. The problem was that the node the pod usually runs on was down, and another node took over (which didn't work as expected, for whatever reason). I'd had the "node down" scenario a few times before and it never caused any issues. I couldn't get the StatefulSet and Deployment back up and running without booting the node that was down, but as soon as that node was up and running again, the StatefulSet and Deployment immediately came back to life as well.

Upvotes: 2

baatasaari

Reputation: 31

I had a similar error:

 The volume pvc-2885ea01-f4fb-11eb-9528-00505698bd8b
   cannot be attached to the node node1 since it is already attached to the node node2

I use Longhorn as the storage provisioner and manager, so I just detached the PV named in the error and restarted the StatefulSet. This time it was automatically able to attach to the PV correctly.
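
For reference, a sketch for checking which node Longhorn currently has the volume attached to, via its Volume CRD (assuming the default longhorn-system namespace; currentNodeID is my reading of Longhorn's volume status and may differ by version):

kubectl -n longhorn-system get volumes.longhorn.io pvc-2885ea01-f4fb-11eb-9528-00505698bd8b -o jsonpath='{.status.currentNodeID}'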

Upvotes: 1

Rico

Reputation: 61551

Looks like you have a dangling PVC and/or PV that is attached to one of your nodes. You can SSH into the node and run df or mount to check.
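
For example, a minimal check on the node itself, assuming the PV name from the error (the grep pattern is illustrative; exact mount paths depend on your kubelet setup):

mount | grep pvc-faedc8
df -h | grep -i pvc-faedc8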

If you look at the StatefulSet docs, the PVCs in a StatefulSet are always mapped to their pod names, so it may be possible that you still have a dangling pod(?)

If you have a dangling pod:

$ kubectl -n test delete pod <pod-name>

You may have to force it:

$ kubectl -n test delete pod <pod-name> --grace-period=0 --force

Then, you can try deleting the PVC and its corresponding PV:

$ kubectl delete pvc pvc-faedc8
$ kubectl delete pv <pv-name>

Upvotes: 21
