Reputation: 220
I created a Pod with resource requests and limits in an Autopilot cluster:
Limits:
  cpu: 500m
  ephemeral-storage: 1Gi
  memory: 512Mi
Requests:
  cpu: 500m
  ephemeral-storage: 1Gi
  memory: 512Mi
But based on what I read, everything should be configured automatically, and I don't see how I could add new nodes to the cluster.
Warning FailedScheduling 2m39s (x3979 over 4d3h) gke.io/optimize-utilization-scheduler 0/3 nodes are available: 1 Insufficient memory, 3 Insufficient cpu.
Normal NotTriggerScaleUp 85s (x68738 over 4d5h) cluster-autoscaler pod didn't trigger scale-up (it wouldn't fit if a new node is added): 2 node(s) didn't match node selector
Google console shows possible actions:
Increase maximum size limit for autoscaling in one or more node pools that have autoscaling enabled.
But this is Autopilot: according to the documentation, scaling should happen automatically, and I cannot change that setting at all. Very weird.
Upvotes: 8
Views: 3915
Reputation: 16336
This is hard to debug precisely without seeing your Pod spec YAML. However, getting a NotTriggerScaleUp message means that Autopilot will never add nodes for these Pods, and they will stay stuck in Pending. Most likely there are conditions that the autoscaler cannot satisfy by adding new nodes.
An example of an unsatisfiable condition would be a node selector where the Pod requests placement in a zone that doesn't exist. Since the autoscaler can't create a node in a non-existent zone, that Pod will sit in Pending forever.
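To make that concrete, here is a minimal sketch of the kind of check you could run by hand against your own Pod spec. The helper name `unsatisfiable_zone`, the sample Pod spec, and the zone list are all hypothetical; the only real identifier is the well-known node label `topology.kubernetes.io/zone`. It only covers a plain `nodeSelector` on the zone label, not affinity rules or other selectors.

```python
from typing import Optional

# Well-known Kubernetes node label that carries the zone name.
ZONE_LABEL = "topology.kubernetes.io/zone"


def unsatisfiable_zone(pod_spec: dict, available_zones: set) -> Optional[str]:
    """Return the requested zone if it is not in available_zones, else None."""
    selector = pod_spec.get("nodeSelector") or {}
    zone = selector.get(ZONE_LABEL)
    if zone is not None and zone not in available_zones:
        return zone
    return None


# Hypothetical Pod spec fragment and zone list, for illustration only.
pod_spec = {"nodeSelector": {ZONE_LABEL: "us-central1-z"}}
cluster_zones = {"us-central1-a", "us-central1-b", "us-central1-c"}

bad_zone = unsatisfiable_zone(pod_spec, cluster_zones)
if bad_zone:
    print(f"nodeSelector requests zone {bad_zone!r}, which the cluster does not span")
```

If a check like this flags your selector, no amount of waiting will help, because a new node can never carry the requested label value.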
When the autoscaler can provision resources for your Pod, you will see a TriggeredScaleUp message (typically within ~10 seconds of the Pod entering the Pending state, but it could take a minute in some cases).
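As a rough illustration of how to read those two event reasons, the sketch below scans event lines in the format shown in the question and reports which of the two was seen last. The function name `scale_up_status` and the status strings are my own invention, and it assumes the reason is the second whitespace-separated field and that lines are listed oldest first.

```python
def scale_up_status(event_lines):
    """Classify a pending Pod from its event lines (assumed oldest first)."""
    status = "unknown"
    for line in event_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # not an event line
        reason = fields[1]  # assumed layout: TYPE REASON AGE ...
        if reason == "TriggeredScaleUp":
            status = "scaling"  # autoscaler is provisioning capacity
        elif reason == "NotTriggerScaleUp":
            status = "blocked"  # adding a node would not help this Pod
    return status


# The two events quoted in the question.
events = [
    "Warning FailedScheduling 2m39s (x3979 over 4d3h) gke.io/optimize-utilization-scheduler "
    "0/3 nodes are available: 1 Insufficient memory, 3 Insufficient cpu.",
    "Normal NotTriggerScaleUp 85s (x68738 over 4d5h) cluster-autoscaler "
    "pod didn't trigger scale-up (it wouldn't fit if a new node is added): "
    "2 node(s) didn't match node selector",
]
print(scale_up_status(events))
```

For the events in the question this reports "blocked", which matches the "didn't match node selector" detail at the end of the NotTriggerScaleUp line.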
I wrote up a more general explanation of how pending pods work in Autopilot, and what you can look for.
Upvotes: 3