Reputation: 304
I'm running into a strange problem on my Terraform-managed GKE cluster: I have a deployment that requests a GcePersistentVolume through a PVC, and when it is created I get a "Can't scale up nodes"
notification in my GCloud console.
If I inspect the log, it says:
reason: {
  messageId: "no.scale.up.mig.failing.predicate"
  parameters: [
    0: ""
    1: "pod has unbound immediate PersistentVolumeClaims"
  ]
}
Without this deployment, I get no scale-up errors at all.
The PVC in question:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  finalizers:
    - kubernetes.io/pvc-protection
  name: nfs
  namespace: nfs
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard
  volumeMode: Filesystem
status:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  phase: Bound
My deployment is running fine and the PV is created right away and bound to my PVC, so I find this "Can't scale up nodes" error really strange.
(It's a single-zone cluster with a single NodePool.)
Any ideas?
Thanks a lot.
Upvotes: 1
Views: 2568
Reputation: 31
I'm having the same problem. It is weird because when you create a PVC in GKE, the PV is provisioned dynamically (and indeed it is), so you check with kubectl get pv,pvc --all-namespaces
and everything looks normal. But it seems that when a deployment that uses a PVC is created, this error appears while the PVC is still being provisioned, and the cluster picks it up and displays the alert (creating some false-positive alerts). It looks like a timing issue.
One workaround is to change the value of the storageClassName
definition. If instead of standard
you use standard-rwo
(both appear as defaults in the Storage Classes tab under Storage), the problem seems to disappear. The consequence of this is that the type of the underlying disk changes from Standard persistent disk
to Balanced persistent disk; anyhow, the latter one performs better.
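For illustration, a minimal sketch of the claim from the question with only the storage class swapped (name and namespace taken from the question, everything else unchanged):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs
  namespace: nfs
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  # only this line changes compared to the original claim
  storageClassName: standard-rwo

standard-rwo is provisioned through the Compute Engine persistent disk CSI driver, which is where the Balanced persistent disk type mentioned above comes from.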
EDIT:
It is about Storage Classes. The volumeBindingMode of the default standard
class is Immediate. According to the documentation:
The Immediate mode indicates that volume binding and dynamic provisioning occurs once the PersistentVolumeClaim is created. For storage backends that are topology-constrained and not globally accessible from all Nodes in the cluster, PersistentVolumes will be bound or provisioned without knowledge of the Pod's scheduling requirements. This may result in unschedulable Pods.
A cluster administrator can address this issue by specifying the WaitForFirstConsumer mode which will delay the binding and provisioning of a PersistentVolume until a Pod using the PersistentVolumeClaim is created. PersistentVolumes will be selected or provisioned conforming to the topology that is specified by the Pod's scheduling constraints. These include, but are not limited to, resource requirements, node selectors, pod affinity and anti-affinity, and taints and tolerations.
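To confirm which binding mode each of the default classes uses on a given cluster, a quick check could look like this (the custom-columns layout is just one convenient choice):

kubectl get storageclass standard standard-rwo \
  -o custom-columns=NAME:.metadata.name,PROVISIONER:.provisioner,BINDINGMODE:.volumeBindingMode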
So, if all the properties of the standard
Storage class
need to be kept, another solution would be to create another Storage class (a sketch of the result follows the steps):

1. Take the definition of the standard Storage class.
2. In that definition, change the name and change volumeBindingMode: Immediate to volumeBindingMode: WaitForFirstConsumer.
3. Apply it to the cluster (kubectl apply -f <file path>).
4. In the storageClassName definition of the PVC, change it to the name from step #2.
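For reference, a rough sketch of what the copy from step #2 could look like; the name standard-wffc is made up here, and the provisioner and disk type are assumed to mirror GKE's standard class:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  # hypothetical name; any name that does not collide with the defaults works
  name: standard-wffc
# assumed to match the default standard class (Standard persistent disks)
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
# the only change that matters for the scale-up alert
volumeBindingMode: WaitForFirstConsumer

Then, per step #4, the PVC's storageClassName would point at standard-wffc instead of standard.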
Upvotes: 3