Reputation: 99
What I am most interested in is the information I get from docker node ls. Where does Docker store the information about the joined nodes?
Upvotes: 6
Views: 3264
Reputation: 3449
The information from docker node ls can be found in the distributed datastore that handles Manager node membership.
When bootstrapping a cluster with a Manager node, you essentially create a single-node database. Every additional Manager joining the cluster adds to that capacity, and together they form a distributed, consistent datastore (using a consensus algorithm called Raft).
This distributed datastore ensures that the entire node membership state stays consistent, even in the presence of failures and network partitions.
When you join a new Agent node (which deals with Docker Services/Tasks), this node and its information are added to the distributed datastore handled by the Manager nodes. Because Agents have a somewhat different role than Managers, these nodes are stored differently (see the store/nodes section in Swarmkit).
To sum up:
+----------------------------------------------------------------------------------+
|                         Distributed Consistent Datastore                         |
|                                                                                  |
|                +-----------------------------------------------+                 |
|                |         Raft cluster membership store         |                 |
|                |                                               |                 |
|                +---^--------------------^------------------^---+                 |
|                    |                    |                  |                     |
|      +-------------+----+     +---------+--------+     +---+--------------+      |
|      |                  |     |                  |     |                  |      |
|      |     Manager      |     |     Manager      |     |     Manager      |      |
|      |                  |     |                  |     |                  |      |
|      +------------------+     +------------------+     +------------------+      |
|                +-----------------------------------------------+                 |
|                |             Node Membership store             |                 |
|                |                                               |                 |
|                +-----^-------------^-------------^----------^--+                 |
|                      |             |             |          |                    |
+----------------------------------------------------------------------------------+
                       |             |             |          |
    +---------+---+----+----+   +----+----+   +----+---+   +--+-----+--+--------+
    |         |   |         |   |         |   |        |   |        |  |        |
    |  Agent  |   |  Agent  |   |  Agent  |   | Agent  |   | Agent  |  | Agent  |
    |         |   |         |   |         |   |        |   |        |  |        |
    +---------+   +---------+   +---------+   +--------+   +--------+  +--------+
Because the distributed store uses the Raft consensus algorithm, if you lose a majority of Manager nodes, the cluster can no longer process updates or add new nodes (Managers or Agents). This avoids inconsistent data, where a minority of Managers would see their state diverge from the majority during a network partition. It would be bad for Managers to end up with different lists of nodes because each was stuck in its own partition and kept adding nodes to its local store without "synchronizing" that list with the others.
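To make the majority rule concrete, here is a minimal sketch of the arithmetic. The function names (quorum, can_commit, tolerated_failures) are made up for illustration and are not part of Docker or Swarmkit; the only thing assumed is the standard Raft rule that a strict majority of Managers must be reachable.

```python
def quorum(total_managers: int) -> int:
    """Smallest number of Managers that forms a strict majority."""
    return total_managers // 2 + 1


def can_commit(reachable: int, total_managers: int) -> bool:
    """A group of Managers can accept updates only if it holds a majority."""
    return reachable >= quorum(total_managers)


def tolerated_failures(total_managers: int) -> int:
    """How many Managers can be lost while the cluster still accepts updates."""
    return total_managers - quorum(total_managers)


if __name__ == "__main__":
    # Quorum size and tolerated failures for a few cluster sizes.
    for n in (1, 3, 4, 5, 7):
        print(f"managers={n} quorum={quorum(n)} tolerated={tolerated_failures(n)}")

    # Hypothetical partition: 5 Managers split into groups of 3 and 2.
    # Only the side that still holds a majority may add nodes.
    for side in (3, 2):
        print(f"side with {side} of 5 managers can commit: {can_commit(side, 5)}")
```

With 5 Managers split 3/2, only the group of 3 can keep adding nodes, so the two sides can never end up with conflicting node lists. It also shows why an even number of Managers buys nothing: 4 Managers tolerate one failure, the same as 3.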
If you reboot all Swarm Managers, the cluster simply stops processing new data and node joins until a majority of the Manager nodes have rebooted and can once again contact each other. Once a majority is back, the cluster can safely process new updates and add new nodes again. The minority still rebooting then has to catch up with the majority once it comes back.
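The reboot case can be sketched the same way. This is only a toy model under the assumption that Managers come back one at a time; writable_during_reboot is a hypothetical helper, not a Docker API.

```python
from typing import List


def writable_during_reboot(total_managers: int) -> List[bool]:
    """For each Manager that comes back online (one at a time),
    report whether the cluster can accept updates at that point."""
    needed = total_managers // 2 + 1  # strict majority
    return [back_online >= needed for back_online in range(1, total_managers + 1)]


if __name__ == "__main__":
    # Hypothetical cluster of 5 Managers, all rebooted at once.
    for count, writable in enumerate(writable_during_reboot(5), start=1):
        state = "accepting updates" if writable else "blocked"
        print(f"{count}/5 managers back: {state}")
```

In this model a 5-Manager cluster stays blocked until the third Manager is back; the last two only need to catch up afterwards.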
Upvotes: 8