Reputation: 819
I am struggling with a VolumeAttachment error. I have a regional persistent disk in the same GCP project as my regional GKE cluster. My regional cluster is in europe-west2 with nodes in europe-west2-a, b and c; the regional disk is replicated across zones europe-west2-b and c.
I have an nfs-server Deployment manifest which refers to the gcePersistentDisk.
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations: {}
  labels:
    app.kubernetes.io/managed-by: Helm
  name: nfs-server
  namespace: namespace
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  selector:
    matchLabels:
      role: nfs-server
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        role: nfs-server
    spec:
      serviceAccountName: nfs-server
      containers:
        - image: gcr.io/google_containers/volume-nfs:0.8
          imagePullPolicy: IfNotPresent
          name: nfs-server
          ports:
            - containerPort: 2049
              name: nfs
              protocol: TCP
            - containerPort: 20048
              name: mountd
              protocol: TCP
            - containerPort: 111
              name: rpcbind
              protocol: TCP
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /data
              name: nfs-pvc
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - gcePersistentDisk:
            fsType: ext4
            pdName: my-regional-disk-name
          name: nfs-pvc
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.gke.io/zone
                    operator: In
                    values:
                      - europe-west2-b
                      - europe-west2-c
and my PV/PVC:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 200Gi
  nfs:
    path: /
    server: nfs-server.namespace.svc.cluster.local
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app.kubernetes.io/managed-by: Helm
  name: nfs-pvc
  namespace: namespace
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 8Gi
  storageClassName: ""
  volumeMode: Filesystem
  volumeName: nfs-pv
When I apply the Deployment manifest above, I get the following error:
'rpc error: code = Unavailable desc = ControllerPublish not permitted on node "projects/ap-mc-qa-xxx-xxxx/zones/europe-west2-a/instances/node-instance-id" due to backoff condition'
The VolumeAttachment tells me this:
Attach Error: Message: rpc error: code = NotFound desc = ControllerPublishVolume could not find volume with ID projects/UNSPECIFIED/zones/UNSPECIFIED/disks/my-regional-disk-name: googleapi: Error 0: , notFound
These manifests seemed to work fine when they were deployed for a zonal cluster/disk. I've checked things like making sure the cluster service account has the necessary permissions, and the disk is currently not in use.
What am I missing???
Upvotes: 0
Views: 1401
Reputation: 819
So the reason the above won't work is that the regional persistent disk feature creates persistent disks that are available in two zones within the same region. To use that feature, the volume must be provisioned as a PersistentVolume; referencing the volume directly from a pod is not supported. Something like this:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteMany
  gcePersistentDisk:
    pdName: my-regional-disk
    fsType: ext4
Now trying to figure out how to reconfigure the NFS server to use a regional disk.
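Presumably the Deployment would then consume the disk through a claim bound to that PersistentVolume instead of the inline gcePersistentDisk volume. A rough sketch (the claim name nfs-disk-pvc is just illustrative, and since a GCE PD only supports ReadWriteOnce/ReadOnlyMany access, the PV above would need ReadWriteOnce rather than ReadWriteMany):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-disk-pvc
  namespace: namespace
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""   # empty string prevents dynamic provisioning
  volumeName: nfs-pv     # bind explicitly to the PersistentVolume above
  resources:
    requests:
      storage: 200Gi

and the Deployment's volumes section would then reference the claim rather than the disk:

      volumes:
        - name: nfs-pvc
          persistentVolumeClaim:
            claimName: nfs-disk-pvc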
Upvotes: 0
Reputation: 1167
I think we should focus on the type of Nodes that make up your Kubernetes cluster.
Regional persistent disks are restricted from being used with memory-optimized machines or compute-optimized machines.
Consider using a non-regional persistent disk storage class if a regional persistent disk is not a hard requirement. If it is a hard requirement, consider scheduling strategies such as taints and tolerations to ensure that the Pods that need the regional PD are scheduled on a node pool that is not made up of optimized machine types.
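A minimal sketch of that approach, assuming a dedicated node pool named regional-pd-pool built from general-purpose machine types and tainted with dedicated=regional-pd:NoSchedule (the pool name and taint key/value are placeholders, not values from your cluster):

spec:
  template:
    spec:
      # pin the pod to the general-purpose pool via the standard GKE node pool label
      nodeSelector:
        cloud.google.com/gke-nodepool: regional-pd-pool
      # allow the pod onto the tainted pool
      tolerations:
        - key: dedicated
          operator: Equal
          value: regional-pd
          effect: NoSchedule

The taint can be applied when the pool is created (for example with gcloud container node-pools create ... --node-taints=dedicated=regional-pd:NoSchedule), so only workloads that explicitly tolerate it land on that pool.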
Upvotes: 0