Reputation: 351
I have a Kubernetes HA environment with three masters. As a test, I shut down two masters (killed the apiserver/kcm/scheduler processes), leaving only one master, and it still works well. I can use kubectl to create a deployment successfully, and the pods are scheduled to different nodes and start. So can anyone explain why an odd number of masters is advised? Thanks.
Upvotes: 5
Views: 7497
Reputation: 1005
Short answer: to get higher fault tolerance for etcd.
Etcd uses Raft for leader election. An etcd cluster needs a majority of its members, a quorum, to agree on a leader and to commit updates. For a cluster with n members, quorum is floor(n/2)+1.
In terms of fault tolerance, adding one node to an odd-sized cluster does not help and arguably makes things worse. How? The number of members that may fail without losing quorum stays the same, but there are now more members that can fail, so the probability of losing quorum is actually higher than before. For example, a 3-member and a 4-member cluster both tolerate only one failure.
For more information on fault tolerance, please check the official etcd doc.
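To make the arithmetic concrete, here is a minimal Python sketch that tabulates quorum and tolerated failures for small cluster sizes. The function names `quorum` and `fault_tolerance` are my own for illustration, not part of etcd or any Kubernetes tooling.

```python
def quorum(n: int) -> int:
    """Majority of an n-member cluster: floor(n/2) + 1."""
    return n // 2 + 1


def fault_tolerance(n: int) -> int:
    """Number of members that can fail while a quorum still remains."""
    return n - quorum(n)


if __name__ == "__main__":
    # Print one row per cluster size so odd and even sizes can be compared.
    for n in range(1, 8):
        print(f"members={n} quorum={quorum(n)} tolerated_failures={fault_tolerance(n)}")
```

Going from 3 to 4 members, or from 5 to 6, raises the quorum but leaves the number of tolerated failures unchanged, which is why the extra even member buys nothing.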
Upvotes: 3
Reputation: 54181
Because if you have an even number of servers, it's a lot easier to end up in a situation where the network breaks and you have exactly 50% on each side, so neither side has a majority and nothing can make progress. With an odd number, a two-way split always leaves exactly one side with majority control.
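As a rough illustration of the split scenario, this Python sketch enumerates every two-way partition of an n-member cluster and lists the ones where neither side reaches a majority. It only models simple two-way splits, and the helper name `splits_without_majority` is hypothetical.

```python
def quorum(n: int) -> int:
    """Majority of an n-member cluster: floor(n/2) + 1."""
    return n // 2 + 1


def splits_without_majority(n: int) -> list[tuple[int, int]]:
    """Two-way splits (a, b) with a + b == n where neither side has a quorum."""
    q = quorum(n)
    return [(a, n - a) for a in range(1, n // 2 + 1) if a < q and n - a < q]


if __name__ == "__main__":
    # Compare odd and even cluster sizes side by side.
    for n in range(2, 8):
        print(n, splits_without_majority(n))
```

Under these assumptions, odd sizes produce no such split, while every even size has exactly one: the even half-and-half partition.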
Upvotes: 9