Reputation: 38113
I have a GKE cluster with a single node pool of size 2. When I add a third node, none of the pods are distributed to that third node.
Here is the original 2-node node pool:
$ kubectl get node
NAME STATUS ROLES AGE VERSION
gke-cluster0-pool-d59e9506-b7nb Ready <none> 13m v1.8.3-gke.0
gke-cluster0-pool-d59e9506-vp6t Ready <none> 18m v1.8.3-gke.0
And here are the pods running on that original node pool:
$ kubectl get po -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default attachment-proxy-659bdc84d-ckdq9 1/1 Running 0 10m 10.0.38.3 gke-cluster0-pool-d59e9506-vp6t
default elasticsearch-0 1/1 Running 0 4m 10.0.39.11 gke-cluster0-pool-d59e9506-b7nb
default front-webapp-646bc49675-86jj6 1/1 Running 0 10m 10.0.38.10 gke-cluster0-pool-d59e9506-vp6t
default kafka-0 1/1 Running 3 4m 10.0.39.9 gke-cluster0-pool-d59e9506-b7nb
default mailgun-http-98f8d997c-hhfdc 1/1 Running 0 4m 10.0.38.17 gke-cluster0-pool-d59e9506-vp6t
default stamps-5b6fc489bc-6xtqz 2/2 Running 3 10m 10.0.38.13 gke-cluster0-pool-d59e9506-vp6t
default user-elasticsearch-6b6dd7fc8-b55xx 1/1 Running 0 10m 10.0.38.4 gke-cluster0-pool-d59e9506-vp6t
default user-http-analytics-6bdd49bd98-p5pd5 1/1 Running 0 4m 10.0.39.8 gke-cluster0-pool-d59e9506-b7nb
default user-http-graphql-67884c678c-7dcdq 1/1 Running 0 4m 10.0.39.7 gke-cluster0-pool-d59e9506-b7nb
default user-service-5cbb8cfb4f-t6zhv 1/1 Running 0 4m 10.0.38.15 gke-cluster0-pool-d59e9506-vp6t
default user-streams-0 1/1 Running 0 4m 10.0.39.10 gke-cluster0-pool-d59e9506-b7nb
default user-streams-elasticsearch-c64b64d6f-2nrtl 1/1 Running 3 10m 10.0.38.6 gke-cluster0-pool-d59e9506-vp6t
default zookeeper-0 1/1 Running 0 4m 10.0.39.12 gke-cluster0-pool-d59e9506-b7nb
kube-lego kube-lego-7799f6b457-skkrc 1/1 Running 0 10m 10.0.38.5 gke-cluster0-pool-d59e9506-vp6t
kube-system event-exporter-v0.1.7-7cb7c5d4bf-vr52v 2/2 Running 0 10m 10.0.38.7 gke-cluster0-pool-d59e9506-vp6t
kube-system fluentd-gcp-v2.0.9-648rh 2/2 Running 0 14m 10.0.38.2 gke-cluster0-pool-d59e9506-vp6t
kube-system fluentd-gcp-v2.0.9-fqjz6 2/2 Running 0 9m 10.0.39.2 gke-cluster0-pool-d59e9506-b7nb
kube-system heapster-v1.4.3-6fc45b6cc4-8cl72 3/3 Running 0 4m 10.0.39.6 gke-cluster0-pool-d59e9506-b7nb
kube-system k8s-snapshots-5699c68696-h8r75 1/1 Running 0 4m 10.0.38.16 gke-cluster0-pool-d59e9506-vp6t
kube-system kube-dns-778977457c-b48w5 3/3 Running 0 4m 10.0.39.5 gke-cluster0-pool-d59e9506-b7nb
kube-system kube-dns-778977457c-sw672 3/3 Running 0 10m 10.0.38.9 gke-cluster0-pool-d59e9506-vp6t
kube-system kube-dns-autoscaler-7db47cb9b7-tjt4l 1/1 Running 0 10m 10.0.38.11 gke-cluster0-pool-d59e9506-vp6t
kube-system kube-proxy-gke-cluster0-pool-d59e9506-b7nb 1/1 Running 0 9m 10.128.0.4 gke-cluster0-pool-d59e9506-b7nb
kube-system kube-proxy-gke-cluster0-pool-d59e9506-vp6t 1/1 Running 0 14m 10.128.0.2 gke-cluster0-pool-d59e9506-vp6t
kube-system kubernetes-dashboard-76c679977c-mwqlv 1/1 Running 0 10m 10.0.38.8 gke-cluster0-pool-d59e9506-vp6t
kube-system l7-default-backend-6497bcdb4d-wkx28 1/1 Running 0 10m 10.0.38.12 gke-cluster0-pool-d59e9506-vp6t
kube-system nginx-ingress-controller-78d546664f-gf6mx 1/1 Running 0 4m 10.0.39.3 gke-cluster0-pool-d59e9506-b7nb
kube-system tiller-deploy-5458cb4cc-26x26 1/1 Running 0 4m 10.0.39.4 gke-cluster0-pool-d59e9506-b7nb
Then I add another node to the node pool:
gcloud container clusters resize cluster0 --node-pool pool --size 3
The third node is added and becomes Ready:
NAME STATUS ROLES AGE VERSION
gke-cluster0-pool-d59e9506-1rzm Ready <none> 3m v1.8.3-gke.0
gke-cluster0-pool-d59e9506-b7nb Ready <none> 14m v1.8.3-gke.0
gke-cluster0-pool-d59e9506-vp6t Ready <none> 19m v1.8.3-gke.0
However, none of the pods except those belonging to a DaemonSet are scheduled onto the added node:
$ kubectl get po -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
default attachment-proxy-659bdc84d-ckdq9 1/1 Running 0 17m 10.0.38.3 gke-cluster0-pool-d59e9506-vp6t
default elasticsearch-0 1/1 Running 0 10m 10.0.39.11 gke-cluster0-pool-d59e9506-b7nb
default front-webapp-646bc49675-86jj6 1/1 Running 0 17m 10.0.38.10 gke-cluster0-pool-d59e9506-vp6t
default kafka-0 1/1 Running 3 11m 10.0.39.9 gke-cluster0-pool-d59e9506-b7nb
default mailgun-http-98f8d997c-hhfdc 1/1 Running 0 10m 10.0.38.17 gke-cluster0-pool-d59e9506-vp6t
default stamps-5b6fc489bc-6xtqz 2/2 Running 3 16m 10.0.38.13 gke-cluster0-pool-d59e9506-vp6t
default user-elasticsearch-6b6dd7fc8-b55xx 1/1 Running 0 17m 10.0.38.4 gke-cluster0-pool-d59e9506-vp6t
default user-http-analytics-6bdd49bd98-p5pd5 1/1 Running 0 10m 10.0.39.8 gke-cluster0-pool-d59e9506-b7nb
default user-http-graphql-67884c678c-7dcdq 1/1 Running 0 10m 10.0.39.7 gke-cluster0-pool-d59e9506-b7nb
default user-service-5cbb8cfb4f-t6zhv 1/1 Running 0 10m 10.0.38.15 gke-cluster0-pool-d59e9506-vp6t
default user-streams-0 1/1 Running 0 10m 10.0.39.10 gke-cluster0-pool-d59e9506-b7nb
default user-streams-elasticsearch-c64b64d6f-2nrtl 1/1 Running 3 17m 10.0.38.6 gke-cluster0-pool-d59e9506-vp6t
default zookeeper-0 1/1 Running 0 10m 10.0.39.12 gke-cluster0-pool-d59e9506-b7nb
kube-lego kube-lego-7799f6b457-skkrc 1/1 Running 0 17m 10.0.38.5 gke-cluster0-pool-d59e9506-vp6t
kube-system event-exporter-v0.1.7-7cb7c5d4bf-vr52v 2/2 Running 0 17m 10.0.38.7 gke-cluster0-pool-d59e9506-vp6t
kube-system fluentd-gcp-v2.0.9-648rh 2/2 Running 0 20m 10.0.38.2 gke-cluster0-pool-d59e9506-vp6t
kube-system fluentd-gcp-v2.0.9-8tb4n 2/2 Running 0 4m 10.0.40.2 gke-cluster0-pool-d59e9506-1rzm
kube-system fluentd-gcp-v2.0.9-fqjz6 2/2 Running 0 15m 10.0.39.2 gke-cluster0-pool-d59e9506-b7nb
kube-system heapster-v1.4.3-6fc45b6cc4-8cl72 3/3 Running 0 11m 10.0.39.6 gke-cluster0-pool-d59e9506-b7nb
kube-system k8s-snapshots-5699c68696-h8r75 1/1 Running 0 10m 10.0.38.16 gke-cluster0-pool-d59e9506-vp6t
kube-system kube-dns-778977457c-b48w5 3/3 Running 0 11m 10.0.39.5 gke-cluster0-pool-d59e9506-b7nb
kube-system kube-dns-778977457c-sw672 3/3 Running 0 17m 10.0.38.9 gke-cluster0-pool-d59e9506-vp6t
kube-system kube-dns-autoscaler-7db47cb9b7-tjt4l 1/1 Running 0 17m 10.0.38.11 gke-cluster0-pool-d59e9506-vp6t
kube-system kube-proxy-gke-cluster0-pool-d59e9506-1rzm 1/1 Running 0 4m 10.128.0.3 gke-cluster0-pool-d59e9506-1rzm
kube-system kube-proxy-gke-cluster0-pool-d59e9506-b7nb 1/1 Running 0 15m 10.128.0.4 gke-cluster0-pool-d59e9506-b7nb
kube-system kube-proxy-gke-cluster0-pool-d59e9506-vp6t 1/1 Running 0 20m 10.128.0.2 gke-cluster0-pool-d59e9506-vp6t
kube-system kubernetes-dashboard-76c679977c-mwqlv 1/1 Running 0 17m 10.0.38.8 gke-cluster0-pool-d59e9506-vp6t
kube-system l7-default-backend-6497bcdb4d-wkx28 1/1 Running 0 17m 10.0.38.12 gke-cluster0-pool-d59e9506-vp6t
kube-system nginx-ingress-controller-78d546664f-gf6mx 1/1 Running 0 11m 10.0.39.3 gke-cluster0-pool-d59e9506-b7nb
kube-system tiller-deploy-5458cb4cc-26x26 1/1 Running 0 11m 10.0.39.4 gke-cluster0-pool-d59e9506-b7nb
What is going on? Why are the pods not spreading onto the added node? I expected them to be distributed to the third node. How can I get the workload to spread to it?
Technically, in terms of the resource requests in my manifests, my entire application fits onto one node. But when the second node was added, the application was distributed across both nodes. So I would think that when I add a third node, pods would be scheduled onto that node as well. But that is not what I am seeing: only DaemonSet pods are scheduled onto the third node. I have tried growing and shrinking the node pool, to no avail.
Update
The two preemptible nodes restarted and now all the pods are on one node. What's going on? Is increasing resource requests the only way to make them spread out?
Upvotes: 0
Views: 946
Reputation: 76
This is expected behavior. New pods are scheduled onto empty nodes, but already-running pods are not moved automatically. The Kubernetes scheduler is generally conservative about rescheduling, so it won't evict a running pod without a reason. Pods can be stateful (like a database), so Kubernetes doesn't kill and reschedule a pod on its own.
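If you just want the existing workload to rebalance onto the new node, one approach is to delete pods yourself so their controllers recreate them; the scheduler then considers all three nodes, including the empty one. A minimal sketch (the pod and Deployment names here are taken from your listing above; expect a brief restart of anything you delete):
$ # Delete a single Deployment-managed pod; its ReplicaSet recreates it,
$ # and the replacement may land on the new node.
$ kubectl delete pod mailgun-http-98f8d997c-hhfdc
$ # Or bounce a whole Deployment by scaling it to zero and back:
$ kubectl scale deployment mailgun-http --replicas=0
$ kubectl scale deployment mailgun-http --replicas=1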
There is a project in development that does what you're looking for: https://github.com/kubernetes-incubator/descheduler. I haven't used it myself, but it is under active development by the community.
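Until something like the descheduler is in place, you can get a similar effect by hand with cordon/drain: cordon a crowded node, evict its pods so they are rescheduled elsewhere (including onto the new node), then uncordon it. A rough sketch, using one of the node names from your output; note that drain evicts everything except DaemonSet pods, and --delete-local-data will discard emptyDir data on that node:
$ kubectl cordon gke-cluster0-pool-d59e9506-vp6t
$ kubectl drain gke-cluster0-pool-d59e9506-vp6t --ignore-daemonsets --delete-local-data
$ kubectl uncordon gke-cluster0-pool-d59e9506-vp6t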
Upvotes: 5
Reputation: 698
I am a complete n00b here, still learning Docker and Kubernetes, but after reading your issue it sounds like you may be having a problem with quorum. Have you tried starting five nodes (n/2 + 1, i.e. a quorum of 3 out of 5)? Both Kubernetes (through etcd) and Docker SwarmKit use the Raft consensus algorithm, so you may also want to read up on Raft. This video talks about Raft and quorum and might help if it indeed matches your plight: https://youtu.be/Qsv-q8WbIZY?t=2m58s
Upvotes: 0