Dmitry Minkovsky

Reputation: 38113

Kubernetes not spreading pods across available nodes

I have a GKE cluster with a single node pool of size 2. When I add a third node, none of the pods are distributed to that third node.

Here is the original 2-node node pool:

$ kubectl get node
NAME                              STATUS    ROLES     AGE       VERSION
gke-cluster0-pool-d59e9506-b7nb   Ready     <none>    13m       v1.8.3-gke.0
gke-cluster0-pool-d59e9506-vp6t   Ready     <none>    18m       v1.8.3-gke.0

And here are the pods running on that original node pool:

$ kubectl get po -o wide --all-namespaces
NAMESPACE     NAME                                         READY     STATUS      RESTARTS   AGE       IP           NODE
default       attachment-proxy-659bdc84d-ckdq9             1/1       Running     0          10m       10.0.38.3    gke-cluster0-pool-d59e9506-vp6t
default       elasticsearch-0                              1/1       Running     0          4m        10.0.39.11   gke-cluster0-pool-d59e9506-b7nb
default       front-webapp-646bc49675-86jj6                1/1       Running     0          10m       10.0.38.10   gke-cluster0-pool-d59e9506-vp6t
default       kafka-0                                      1/1       Running     3          4m        10.0.39.9    gke-cluster0-pool-d59e9506-b7nb
default       mailgun-http-98f8d997c-hhfdc                 1/1       Running     0          4m        10.0.38.17   gke-cluster0-pool-d59e9506-vp6t
default       stamps-5b6fc489bc-6xtqz                      2/2       Running     3          10m       10.0.38.13   gke-cluster0-pool-d59e9506-vp6t
default       user-elasticsearch-6b6dd7fc8-b55xx           1/1       Running     0          10m       10.0.38.4    gke-cluster0-pool-d59e9506-vp6t
default       user-http-analytics-6bdd49bd98-p5pd5         1/1       Running     0          4m        10.0.39.8    gke-cluster0-pool-d59e9506-b7nb
default       user-http-graphql-67884c678c-7dcdq           1/1       Running     0          4m        10.0.39.7    gke-cluster0-pool-d59e9506-b7nb
default       user-service-5cbb8cfb4f-t6zhv                1/1       Running     0          4m        10.0.38.15   gke-cluster0-pool-d59e9506-vp6t
default       user-streams-0                               1/1       Running     0          4m        10.0.39.10   gke-cluster0-pool-d59e9506-b7nb
default       user-streams-elasticsearch-c64b64d6f-2nrtl   1/1       Running     3          10m       10.0.38.6    gke-cluster0-pool-d59e9506-vp6t
default       zookeeper-0                                  1/1       Running     0          4m        10.0.39.12   gke-cluster0-pool-d59e9506-b7nb
kube-lego     kube-lego-7799f6b457-skkrc                   1/1       Running     0          10m       10.0.38.5    gke-cluster0-pool-d59e9506-vp6t
kube-system   event-exporter-v0.1.7-7cb7c5d4bf-vr52v       2/2       Running     0          10m       10.0.38.7    gke-cluster0-pool-d59e9506-vp6t
kube-system   fluentd-gcp-v2.0.9-648rh                     2/2       Running     0          14m       10.0.38.2    gke-cluster0-pool-d59e9506-vp6t
kube-system   fluentd-gcp-v2.0.9-fqjz6                     2/2       Running     0          9m        10.0.39.2    gke-cluster0-pool-d59e9506-b7nb
kube-system   heapster-v1.4.3-6fc45b6cc4-8cl72             3/3       Running     0          4m        10.0.39.6    gke-cluster0-pool-d59e9506-b7nb
kube-system   k8s-snapshots-5699c68696-h8r75               1/1       Running     0          4m        10.0.38.16   gke-cluster0-pool-d59e9506-vp6t
kube-system   kube-dns-778977457c-b48w5                    3/3       Running     0          4m        10.0.39.5    gke-cluster0-pool-d59e9506-b7nb
kube-system   kube-dns-778977457c-sw672                    3/3       Running     0          10m       10.0.38.9    gke-cluster0-pool-d59e9506-vp6t
kube-system   kube-dns-autoscaler-7db47cb9b7-tjt4l         1/1       Running     0          10m       10.0.38.11   gke-cluster0-pool-d59e9506-vp6t
kube-system   kube-proxy-gke-cluster0-pool-d59e9506-b7nb   1/1       Running     0          9m        10.128.0.4   gke-cluster0-pool-d59e9506-b7nb
kube-system   kube-proxy-gke-cluster0-pool-d59e9506-vp6t   1/1       Running     0          14m       10.128.0.2   gke-cluster0-pool-d59e9506-vp6t
kube-system   kubernetes-dashboard-76c679977c-mwqlv        1/1       Running     0          10m       10.0.38.8    gke-cluster0-pool-d59e9506-vp6t
kube-system   l7-default-backend-6497bcdb4d-wkx28          1/1       Running     0          10m       10.0.38.12   gke-cluster0-pool-d59e9506-vp6t
kube-system   nginx-ingress-controller-78d546664f-gf6mx    1/1       Running     0          4m        10.0.39.3    gke-cluster0-pool-d59e9506-b7nb
kube-system   tiller-deploy-5458cb4cc-26x26                1/1       Running     0          4m        10.0.39.4    gke-cluster0-pool-d59e9506-b7nb

Then I add another node to the node pool:

$ gcloud container clusters resize cluster0 --node-pool pool --size 3

The third node is added and becomes ready:

$ kubectl get node
NAME                              STATUS    ROLES     AGE       VERSION
gke-cluster0-pool-d59e9506-1rzm   Ready     <none>    3m        v1.8.3-gke.0
gke-cluster0-pool-d59e9506-b7nb   Ready     <none>    14m       v1.8.3-gke.0
gke-cluster0-pool-d59e9506-vp6t   Ready     <none>    19m       v1.8.3-gke.0

However, none of the pods, except those belonging to a DaemonSet, are scheduled onto the added node:

$ kubectl get po -o wide --all-namespaces
NAMESPACE     NAME                                         READY     STATUS      RESTARTS   AGE       IP           NODE
default       attachment-proxy-659bdc84d-ckdq9             1/1       Running     0          17m       10.0.38.3    gke-cluster0-pool-d59e9506-vp6t
default       elasticsearch-0                              1/1       Running     0          10m       10.0.39.11   gke-cluster0-pool-d59e9506-b7nb
default       front-webapp-646bc49675-86jj6                1/1       Running     0          17m       10.0.38.10   gke-cluster0-pool-d59e9506-vp6t
default       kafka-0                                      1/1       Running     3          11m       10.0.39.9    gke-cluster0-pool-d59e9506-b7nb
default       mailgun-http-98f8d997c-hhfdc                 1/1       Running     0          10m       10.0.38.17   gke-cluster0-pool-d59e9506-vp6t
default       stamps-5b6fc489bc-6xtqz                      2/2       Running     3          16m       10.0.38.13   gke-cluster0-pool-d59e9506-vp6t
default       user-elasticsearch-6b6dd7fc8-b55xx           1/1       Running     0          17m       10.0.38.4    gke-cluster0-pool-d59e9506-vp6t
default       user-http-analytics-6bdd49bd98-p5pd5         1/1       Running     0          10m       10.0.39.8    gke-cluster0-pool-d59e9506-b7nb
default       user-http-graphql-67884c678c-7dcdq           1/1       Running     0          10m       10.0.39.7    gke-cluster0-pool-d59e9506-b7nb
default       user-service-5cbb8cfb4f-t6zhv                1/1       Running     0          10m       10.0.38.15   gke-cluster0-pool-d59e9506-vp6t
default       user-streams-0                               1/1       Running     0          10m       10.0.39.10   gke-cluster0-pool-d59e9506-b7nb
default       user-streams-elasticsearch-c64b64d6f-2nrtl   1/1       Running     3          17m       10.0.38.6    gke-cluster0-pool-d59e9506-vp6t
default       zookeeper-0                                  1/1       Running     0          10m       10.0.39.12   gke-cluster0-pool-d59e9506-b7nb
kube-lego     kube-lego-7799f6b457-skkrc                   1/1       Running     0          17m       10.0.38.5    gke-cluster0-pool-d59e9506-vp6t
kube-system   event-exporter-v0.1.7-7cb7c5d4bf-vr52v       2/2       Running     0          17m       10.0.38.7    gke-cluster0-pool-d59e9506-vp6t
kube-system   fluentd-gcp-v2.0.9-648rh                     2/2       Running     0          20m       10.0.38.2    gke-cluster0-pool-d59e9506-vp6t
kube-system   fluentd-gcp-v2.0.9-8tb4n                     2/2       Running     0          4m        10.0.40.2    gke-cluster0-pool-d59e9506-1rzm
kube-system   fluentd-gcp-v2.0.9-fqjz6                     2/2       Running     0          15m       10.0.39.2    gke-cluster0-pool-d59e9506-b7nb
kube-system   heapster-v1.4.3-6fc45b6cc4-8cl72             3/3       Running     0          11m       10.0.39.6    gke-cluster0-pool-d59e9506-b7nb
kube-system   k8s-snapshots-5699c68696-h8r75               1/1       Running     0          10m       10.0.38.16   gke-cluster0-pool-d59e9506-vp6t
kube-system   kube-dns-778977457c-b48w5                    3/3       Running     0          11m       10.0.39.5    gke-cluster0-pool-d59e9506-b7nb
kube-system   kube-dns-778977457c-sw672                    3/3       Running     0          17m       10.0.38.9    gke-cluster0-pool-d59e9506-vp6t
kube-system   kube-dns-autoscaler-7db47cb9b7-tjt4l         1/1       Running     0          17m       10.0.38.11   gke-cluster0-pool-d59e9506-vp6t
kube-system   kube-proxy-gke-cluster0-pool-d59e9506-1rzm   1/1       Running     0          4m        10.128.0.3   gke-cluster0-pool-d59e9506-1rzm
kube-system   kube-proxy-gke-cluster0-pool-d59e9506-b7nb   1/1       Running     0          15m       10.128.0.4   gke-cluster0-pool-d59e9506-b7nb
kube-system   kube-proxy-gke-cluster0-pool-d59e9506-vp6t   1/1       Running     0          20m       10.128.0.2   gke-cluster0-pool-d59e9506-vp6t
kube-system   kubernetes-dashboard-76c679977c-mwqlv        1/1       Running     0          17m       10.0.38.8    gke-cluster0-pool-d59e9506-vp6t
kube-system   l7-default-backend-6497bcdb4d-wkx28          1/1       Running     0          17m       10.0.38.12   gke-cluster0-pool-d59e9506-vp6t
kube-system   nginx-ingress-controller-78d546664f-gf6mx    1/1       Running     0          11m       10.0.39.3    gke-cluster0-pool-d59e9506-b7nb
kube-system   tiller-deploy-5458cb4cc-26x26                1/1       Running     0          11m       10.0.39.4    gke-cluster0-pool-d59e9506-b7nb

What is going on? Why aren't the pods spreading onto the added node? I expected the pods to be distributed to the third node. How can I get the workload to spread to it?

Technically, in terms of manifest resource requests, my entire application fits onto one node. But when the second node was added, the application was distributed to it. So I would think that when I add a third node, pods would be scheduled onto that node as well. But that is not what I am seeing: only DaemonSet pods land on the third node. I have tried growing and shrinking the node pool, to no avail.


Update

The two preemptible nodes restarted and now all the pods are on one node. What's going on? Is increasing resource requests the only way to make them spread out?
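For concreteness, the kind of spreading I'm after could be expressed as a preferred pod anti-affinity on a Deployment's pod template, so replicas of the same app repel each other across nodes. This is only a sketch; the `app: user-service` label is illustrative, not copied from my manifests:

```yaml
# Hypothetical fragment of a Deployment's pod template spec.
# The label selector is an assumption; substitute your own app labels.
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: user-service          # spread replicas of this app apart
          topologyKey: kubernetes.io/hostname
```

Note this only influences scheduling decisions at pod creation time; it does not move pods that are already running.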

Upvotes: 0

Views: 946

Answers (2)

Hubert Chen

Reputation: 76

This is expected behavior. New pods are scheduled onto empty nodes, but running pods are not moved automatically. The Kubernetes scheduler only places pods when they are created; it is deliberately conservative and will not evict and reschedule a running pod without reason. Pods can be stateful (like a database), so Kubernetes doesn't want to kill and reschedule a pod on its own.
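One way to nudge existing pods onto the new node is to delete them (or drain a crowded node) and let their controllers recreate them, at which point the scheduler will consider the mostly-empty node. A rough sketch, reusing names from the question; deleting pods causes brief downtime for single-replica workloads, so use with care:

```shell
# Delete a Deployment-managed pod; its ReplicaSet recreates it, and the
# scheduler may place the replacement on the new, less-loaded node.
kubectl delete pod attachment-proxy-659bdc84d-ckdq9

# Or evict everything from a crowded node so its pods reschedule elsewhere,
# then allow scheduling on that node again. DaemonSet pods are skipped.
kubectl drain gke-cluster0-pool-d59e9506-vp6t --ignore-daemonsets --force
kubectl uncordon gke-cluster0-pool-d59e9506-vp6t
```

Neither command guarantees the replacement lands on the new node; the scheduler still weighs resource requests and spreading priorities across all nodes.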

There is an incubator project that does what you're looking for: the descheduler (https://github.com/kubernetes-incubator/descheduler). It evicts pods from over-utilized nodes so the scheduler can place them again. I haven't used it myself, but it is under active development by the community.
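As a sketch of how the descheduler is configured, it takes a policy file enabling strategies such as `LowNodeUtilization`. The thresholds below are illustrative, and the exact strategy names and schema come from the project's README at the time, so check the repo before relying on them:

```yaml
# Hypothetical descheduler policy: evict pods from nodes above the
# targetThresholds when some node sits below the (under-utilization)
# thresholds, so evicted pods can reschedule onto the emptier node.
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "LowNodeUtilization":
    enabled: true
    params:
      nodeResourceUtilizationThresholds:
        thresholds:          # nodes below all of these are under-utilized
          cpu: 20
          memory: 20
          pods: 20
        targetThresholds:    # nodes above any of these are over-utilized
          cpu: 50
          memory: 50
          pods: 50
```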

Upvotes: 5

DemiSheep

Reputation: 698

I am a complete n00b here, still learning about Docker/Kubernetes, but after reading your issue it sounds like you may be having an issue with quorum. Have you tried scaling up to 5 nodes (n/2 + 1)? Both Kubernetes and Docker SwarmKit use the Raft consensus algorithm, so you may also want to read up on Raft. This video might help if it indeed matches your plight; it talks about Raft and quorum: https://youtu.be/Qsv-q8WbIZY?t=2m58s

Upvotes: 0
