Reputation: 982
I would like to reboot my CoreOS cluster nodes one by one, because I have read that rebooting all nodes at once is a bad idea (etcd and Ceph could lose quorum, etc.). What is the proper way of doing this, other than logging into each machine manually and issuing the reboot command?
Is there a generic way to reboot n nodes in a cluster, wait for them to be up, and then another set of n nodes, until all nodes are rebooted?
Thank you.
Upvotes: 3
Views: 4735
Reputation: 4433
Locksmith is the daemon that handles rebooting a CoreOS node. I recommend picking the etcd-lock reboot strategy:
coreos:
  update:
    reboot-strategy: etcd-lock
By default this will reboot the cluster one node at a time. I use fleetctl to control my CoreOS cluster remotely. This script sends the reboot request to every machine in the cluster:
#!/bin/bash -x
# Ask every machine known to fleet to reboot through locksmith.
for machine in $(fleetctl list-machines --no-legend --full | awk '{ print $1 }'); do
  fleetctl ssh "$machine" "sudo locksmithctl reboot"
done
If your reboot-strategy is etcd-lock, the nodes will not all reboot immediately. Each node waits for the lock, so they reboot one at a time until the whole cluster has rebooted.
Upvotes: 2
Reputation: 61
In the cloud-config.yaml file you could add:
coreos:
  update:
    reboot-strategy: etcd-lock
which means that the machines in your cluster will acquire a lock before rebooting, to ensure that no more than 1 machine is rebooting at any time. Please refer to the documentation for additional information: https://coreos.com/docs/cluster-management/setup/update-strategies/
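On the "n nodes at a time" part of the question: the reboot lock is a semaphore, and as far as I know its size can be raised with locksmithctl set-max, so that more than one machine may hold it at once. Below is a minimal sketch, assuming it is run on a CoreOS node that can reach etcd. The function name set_max_reboots is just a hypothetical helper for illustration; only locksmithctl set-max and locksmithctl status are actual locksmith commands.

```shell
#!/bin/bash
# Sketch: let up to N machines hold the locksmith reboot lock at the same time.
# set_max_reboots is a hypothetical helper name, not part of locksmith.
set_max_reboots() {
  local n="$1"
  # Accept only a positive integer without a leading zero.
  case "$n" in
    ''|*[!0-9]*|0*)
      echo "usage: set_max_reboots <positive integer>" >&2
      return 1
      ;;
  esac
  # Resize the reboot semaphore stored in etcd.
  locksmithctl set-max "$n" || return 1
  # Print the semaphore state so the new limit can be checked.
  locksmithctl status
}
```

For example, set_max_reboots 2 would let two nodes reboot at the same time. Keep the value small enough that etcd still has a majority of its members running while those nodes are down.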
Upvotes: 4