Reputation: 135
I've recently been learning Paxos and now have a basic understanding of how it works. Can anyone explain how Paxos handles packet loss and a new node joining? A simple example would be appreciated.
Upvotes: 1
Views: 491
Reputation: 6832
As pointed out in other answers, message loss and message reordering are handled by the algorithm itself: it is designed to handle exactly those cases.
New nodes joining is a matter of "cluster membership changes". There is a common misconception that cluster membership changes are not covered by Paxos; yet they are described in the 2001 paper Paxos Made Simple in the last paragraph. In this blog post I discuss it. There is a question of how a new node gets a copy of all the state when it joins the cluster. That is discussed in this answer.
Upvotes: 0
Reputation: 7864
The classical Paxos algorithm does not have a concept of "new nodes joining". Some Paxos variants do, such as "Vertical Paxos", but the classic algorithm requires that all nodes be statically defined before running the algorithm.
With respect to packet loss, Paxos uses a very simple infinite loop: "try a round of the algorithm; if anything at all goes wrong, try another round". So if too many packets are lost in the first attempt at achieving resolution (which can be detected via a simple timeout while waiting for replies), a second round can be attempted. If the timeout for that round expires, try again, and so on.
Exactly how packet loss is to be detected and handled is something the Paxos algorithm leaves undefined. It's an implementation-specific detail. This is actually a good thing for production environments since how this is handled can have a pretty big performance impact on Paxos-based systems.
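To make the "try a round, retry on failure" loop concrete, here is a minimal single-proposer sketch in Python. Everything in it is an illustrative assumption rather than part of any Paxos specification: the `Acceptor` class, the `lossy_call` and `broadcast` helpers, the `loss_rate` parameter, and the choice to model a timeout as simply "fewer than a majority of replies arrived". It also ignores the requirement that competing proposers use distinct ballot numbers.

```python
import random

class Acceptor:
    """Single-decree Paxos acceptor (hypothetical minimal version)."""
    def __init__(self):
        self.promised = -1    # highest ballot number promised so far
        self.accepted = None  # (ballot, value) most recently accepted

    def on_prepare(self, ballot):
        if ballot > self.promised:
            self.promised = ballot
            return ("promise", self.accepted)
        return None  # stale ballot: stay silent

    def on_accept(self, ballot, value):
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return ("accepted", ballot)
        return None  # a higher ballot was promised: stay silent

def lossy_call(rng, loss_rate, handler, *args):
    """Simulate one request/reply exchange; either message may be dropped."""
    if rng.random() < loss_rate:
        return None           # request never arrived
    reply = handler(*args)
    if rng.random() < loss_rate:
        return None           # reply lost, although the acceptor did act
    return reply

def broadcast(rng, loss_rate, acceptors, method, *args):
    """Send the same message to every acceptor; return the replies received."""
    replies = []
    for acceptor in acceptors:
        reply = lossy_call(rng, loss_rate, getattr(acceptor, method), *args)
        if reply is not None:
            replies.append(reply)
    return replies

def propose(acceptors, value, rng, loss_rate=0.3, max_rounds=1000):
    """Retry rounds with increasing ballots until a majority accepts."""
    majority = len(acceptors) // 2 + 1
    for ballot in range(max_rounds):
        # Phase 1: too few promises is treated as a timeout, so retry.
        promises = broadcast(rng, loss_rate, acceptors, "on_prepare", ballot)
        if len(promises) < majority:
            continue
        # Adopt the value of the highest-ballot proposal already accepted.
        prior = [acc for _, acc in promises if acc is not None]
        chosen = max(prior, key=lambda bv: bv[0])[1] if prior else value
        # Phase 2: too few acknowledgements is also a timeout, so retry.
        acks = broadcast(rng, loss_rate, acceptors, "on_accept", ballot, chosen)
        if len(acks) >= majority:
            return ballot, chosen
    raise RuntimeError("gave up after max_rounds")
```

Calling `propose([Acceptor() for _ in range(5)], "x", random.Random(0))` returns a `(ballot, value)` pair whose value is `"x"`; how many rounds it takes depends on which simulated messages were dropped. A real implementation would wait on an actual timer and would usually back off between rounds, which is exactly the implementation-specific part mentioned above.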
Upvotes: 2
Reputation: 741
Regarding packet loss, Paxos makes the following assumption about the network:
Messages may be lost, reordered, or duplicated.
This is solved via quorums. A quorum of Acceptors (in classic Paxos, a majority) must accept a value in order for the system to accept it. This also covers the case where a node is failing.
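As a small illustration of why majority quorums work, the sketch below computes the quorum size for a cluster and checks by brute force that any two majorities share at least one node, which is what lets a later round learn about a value accepted in an earlier one. The function names are made up for this example.

```python
from itertools import combinations

def majority(n):
    """Smallest number of nodes that forms a majority of n."""
    return n // 2 + 1

def tolerated_failures(n):
    """How many nodes can be unreachable while a majority still exists."""
    return n - majority(n)

def all_quorums_intersect(nodes):
    """Brute-force check that every pair of majority quorums overlaps."""
    nodes = list(nodes)
    quorums = [set(c) for c in combinations(nodes, majority(len(nodes)))]
    return all(a & b for a, b in combinations(quorums, 2))
```

For a five-node cluster, `majority(5)` is 3 and `tolerated_failures(5)` is 2, so messages to or from two Acceptors can be lost (or those nodes can be down) and a round can still succeed.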
Regarding a new node joining, Paxos does not concern itself with how a node discovers other nodes. That problem is solved by other algorithms.
They automagically know all the nodes and each one's role
For a production implementation, you can use ZooKeeper to handle this new-node detection.
Upvotes: 0