Dmitry Minkovsky

Reputation: 38143

Preemptible node is sometimes failing to join the GKE cluster

I have a preemptible node pool of size 1 on GKE, and I've been running it for almost a month now. Every day the node is preempted after 24 hours, restarts, and rejoins the cluster. Today it restarted but did not rejoin the cluster.

Instead, I noticed that, according to gcloud compute instances list, the underlying instance was running, but it was not included in the output of kubectl get node. I increased the node pool size to 2, whereupon a second instance was launched. That node immediately joined my GKE cluster and pods were scheduled onto it. The first node is still running according to gcloud, but it won't join the cluster.

What's going on? How can I debug this problem?
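For reference, this is roughly what I was running to compare the two views and to resize the pool (cluster, pool, and zone names are placeholders, and the resize flags may differ slightly between gcloud versions):

# What Compute Engine thinks is running vs. what Kubernetes sees
gcloud compute instances list
kubectl get nodes

# Bump the pool from 1 to 2 nodes (the second node joined fine)
gcloud container clusters resize CLUSTER_NAME --node-pool POOL_NAME --num-nodes 2 --zone ZONE

# SSH into the stuck instance to poke around (see the update below)
gcloud compute ssh gke-cluster0-pool-d59e9506-g9sc --zone ZONE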


Update:

I SSHed into the instance and was immediately greeted with this excellent error message:

Broken (or in progress) Kubernetes node setup! Check the cluster initialization status
using the following commands:

Master instance:
  - sudo systemctl status kube-master-installation
  - sudo systemctl status kube-master-configuration

Node instance:
  - sudo systemctl status kube-node-installation
  - sudo systemctl status kube-node-configuration

The results of sudo systemctl status kube-node-installation:

● kube-node-installation.service - Download and install k8s binaries and configurations
   Loaded: loaded (/etc/systemd/system/kube-node-installation.service; enabled; vendor preset: disabled)
   Active: active (exited) since Thu 2017-12-28 21:08:53 UTC; 6h ago
  Process: 945 ExecStart=/home/kubernetes/bin/configure.sh (code=exited, status=0/SUCCESS)
  Process: 941 ExecStartPre=/bin/chmod 544 /home/kubernetes/bin/configure.sh (code=exited, status=0/SUCCESS)
  Process: 937 ExecStartPre=/usr/bin/curl --fail --retry 5 --retry-delay 3 --silent --show-error -H X-Google-Metadata-Request: True -o /home/kubernetes/bin/configure.sh http://metadata.google.internal/computeMetadata/v1/instance/attributes/configure-sh (code=exited, status=0/SUCCESS)
  Process: 933 ExecStartPre=/bin/mount -o remount,exec /home/kubernetes/bin (code=exited, status=0/SUCCESS)
  Process: 930 ExecStartPre=/bin/mount --bind /home/kubernetes/bin /home/kubernetes/bin (code=exited, status=0/SUCCESS)
  Process: 925 ExecStartPre=/bin/mkdir -p /home/kubernetes/bin (code=exited, status=0/SUCCESS)
 Main PID: 945 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 4915)
   Memory: 0B
      CPU: 0
   CGroup: /system.slice/kube-node-installation.service

Dec 28 21:08:52 gke-cluster0-pool-d59e9506-g9sc configure.sh[945]: Downloading node problem detector.
Dec 28 21:08:52 gke-cluster0-pool-d59e9506-g9sc configure.sh[945]:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Dec 28 21:08:52 gke-cluster0-pool-d59e9506-g9sc configure.sh[945]:                                  Dload  Upload   Total   Spent    Left  Speed
Dec 28 21:08:52 gke-cluster0-pool-d59e9506-g9sc configure.sh[945]: [158B blob data]
Dec 28 21:08:52 gke-cluster0-pool-d59e9506-g9sc configure.sh[945]: == Downloaded https://storage.googleapis.com/kubernetes-release/node-problem-detector/node-problem-detector-v0.4.1.tar.gz (SHA1 = a57a3fe64cab8a18ec654f5cef0aec59dae62568) ==
Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc configure.sh[945]: cni-0799f5732f2a11b329d9e3d51b9c8f2e3759f2ff.tar.gz is preloaded.
Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc configure.sh[945]: kubernetes-manifests.tar.gz is preloaded.
Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc configure.sh[945]: mounter is preloaded.
Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc configure.sh[945]: Done for installing kubernetes files
Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc systemd[1]: Started Download and install k8s binaries and configurations.

And the result of sudo systemctl status kube-node-configuration:

● kube-node-configuration.service - Configure kubernetes node
   Loaded: loaded (/etc/systemd/system/kube-node-configuration.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2017-12-28 21:08:53 UTC; 6h ago
  Process: 994 ExecStart=/home/kubernetes/bin/configure-helper.sh (code=exited, status=4)
  Process: 990 ExecStartPre=/bin/chmod 544 /home/kubernetes/bin/configure-helper.sh (code=exited, status=0/SUCCESS)
 Main PID: 994 (code=exited, status=4)
      CPU: 33ms

Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc systemd[1]: Starting Configure kubernetes node...
Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc configure-helper.sh[994]: Start to configure instance for kubernetes
Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc configure-helper.sh[994]: Configuring IP firewall rules
Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc configure-helper.sh[994]: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc configure-helper.sh[994]: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc configure-helper.sh[994]: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc systemd[1]: kube-node-configuration.service: Main process exited, code=exited, status=4/NOPERMISSION
Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc systemd[1]: Failed to start Configure kubernetes node.
Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc systemd[1]: kube-node-configuration.service: Unit entered failed state.
Dec 28 21:08:53 gke-cluster0-pool-d59e9506-g9sc systemd[1]: kube-node-configuration.service: Failed with result 'exit-code'.
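The status=4 exit together with the repeated xtables-lock messages suggests the iptables setup in configure-helper.sh lost a race for the lock. (I believe exit code 4 is what iptables returns for a resource problem such as a held lock; systemd just maps 4 to its own NOPERMISSION name, which looks like a red herring here.) A couple of commands I could have used to dig further, assuming the unit name shown above:

# Full log of the failed configuration run
sudo journalctl -u kube-node-configuration --no-pager

# -w makes iptables wait for the xtables lock instead of failing immediately,
# which is what the log message itself hints at
sudo iptables -w -L -n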

So it looks like kube-node-configuration failed. I ran sudo systemctl restart kube-node-configuration and now the status output is:

● kube-node-configuration.service - Configure kubernetes node
   Loaded: loaded (/etc/systemd/system/kube-node-configuration.service; enabled; vendor preset: disabled)
   Active: active (exited) since Fri 2017-12-29 03:41:36 UTC; 3s ago
 Main PID: 20802 (code=exited, status=0/SUCCESS)
      CPU: 1.851s

Dec 29 03:41:28 gke-cluster0-pool-d59e9506-g9sc configure-helper.sh[20802]: Extend the docker.service configuration to set a higher pids limit
Dec 29 03:41:28 gke-cluster0-pool-d59e9506-g9sc configure-helper.sh[20802]: Docker command line is updated. Restart docker to pick it up
Dec 29 03:41:30 gke-cluster0-pool-d59e9506-g9sc configure-helper.sh[20802]: Start kubelet
Dec 29 03:41:35 gke-cluster0-pool-d59e9506-g9sc configure-helper.sh[20802]: Using kubelet binary at /home/kubernetes/bin/kubelet
Dec 29 03:41:35 gke-cluster0-pool-d59e9506-g9sc configure-helper.sh[20802]: Start kube-proxy static pod
Dec 29 03:41:35 gke-cluster0-pool-d59e9506-g9sc configure-helper.sh[20802]: Start node problem detector
Dec 29 03:41:35 gke-cluster0-pool-d59e9506-g9sc configure-helper.sh[20802]: Using node problem detector binary at /home/kubernetes/bin/node-problem-detector
Dec 29 03:41:36 gke-cluster0-pool-d59e9506-g9sc configure-helper.sh[20802]: Prepare containerized mounter
Dec 29 03:41:36 gke-cluster0-pool-d59e9506-g9sc configure-helper.sh[20802]: Done for the configuration for kubernetes
Dec 29 03:41:36 gke-cluster0-pool-d59e9506-g9sc systemd[1]: Started Configure kubernetes node.

...and the node joined the cluster :). But my original question stands: what happened?

Upvotes: 2

Views: 917

Answers (1)

GrandVizier

Reputation: 639

We were experiencing a similar problem on GKE with preemptible nodes, seeing error messages like these from the nodes:

Extend the docker.service configuration to set a higher pids limit
Docker command line is updated. Restart docker to pick it up
level=info msg="Processing signal 'terminated'"
level=info msg="stopping event stream following graceful shutdown" error="<nil>" module=libcontainerd namespace=moby
level=info msg="Daemon shutdown complete"
docker daemon exited
Start kubelet

After about a month of back-and-forth with Google Support, we learned that the nodes were getting preempted and replaced. The replacement node comes in with the same name, and it all happens without the normal pod disruption of a node being evicted.
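If you want to confirm from the Compute Engine side that a node really was preempted (rather than failing for some other reason), the operations log records preemption events. A rough sketch, assuming the standard operation type name:

# List recent preemption events for instances in the project
gcloud compute operations list --filter="operationType=compute.instances.preempted"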


Backstory: we ran into this problem because Jenkins was running its workers on these nodes, and during the ~2 minute "restart" while the node went away and came back, the Jenkins master would lose its connection and fail the job.

tl;dr: don't use preemptible nodes for this kind of work.
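If you do keep a mixed setup, GKE labels preemptible nodes automatically, so you can at least keep this kind of workload off them. A minimal sketch using the standard node labels (the deployment and pool names below are placeholders):

# Show which nodes in the cluster are preemptible
kubectl get nodes -l cloud.google.com/gke-preemptible=true

# Pin a workload to a specific non-preemptible pool via its node-pool label
kubectl patch deployment jenkins-agent -p '{"spec":{"template":{"spec":{"nodeSelector":{"cloud.google.com/gke-nodepool":"stable-pool"}}}}}'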

Upvotes: 1
