卢声远 Shengyuan Lu

Reputation: 32004

Why do we need an 'arbiter' in MongoDB replication?

Assume we set up MongoDB replication without an arbiter. If the primary becomes unavailable, the replica set will elect a secondary to be the new primary. So I think there is a kind of implicit arbiter, since the replica set elects a primary automatically.

So I am wondering why we need a dedicated arbiter node. Thanks!

Upvotes: 33

Views: 32969

Answers (4)

Jerry

Reputation: 8111

An arbiter is useful in a replica set for the following reasons:

  • A replica set is more reliable with an odd number of voting members. If it has an even number of members, it is better to add an arbiter.
  • Arbiters do not hold data; they exist only to vote in elections when a node fails.
  • An arbiter is a lightweight process that does not consume many hardware resources.
  • Arbiters exchange credentials with the other members of the replica set, and that exchange is encrypted.
  • Votes during elections, heartbeats, and configuration data are not encrypted when exchanged between arbiters and the other members.
  • It is better to run the arbiter on a separate machine, rather than alongside one of the data-bearing members, to retain high availability.
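To make the points above concrete, here is a minimal sketch of what a replica set configuration with one arbiter could look like. The member fields `_id`, `host`, and `arbiterOnly` are the standard replica set configuration fields; the helper name `build_replica_set_config`, the set name `rs0`, and the hostnames are made up for illustration.

```python
def build_replica_set_config(set_name, data_hosts, arbiter_host=None):
    """Build a replica set configuration document as a plain dict.

    Hypothetical helper for illustration; hostnames are placeholders.
    """
    # Data-bearing members: each one stores data and votes in elections.
    members = [{"_id": i, "host": host} for i, host in enumerate(data_hosts)]
    if arbiter_host is not None:
        # The arbiter votes in elections but holds no data.
        members.append(
            {"_id": len(members), "host": arbiter_host, "arbiterOnly": True}
        )
    return {"_id": set_name, "members": members}


config = build_replica_set_config(
    "rs0",
    ["db1.example.net:27017", "db2.example.net:27017"],
    arbiter_host="arb.example.net:27017",
)

# Two data nodes plus one arbiter gives an odd number of voters.
print(len(config["members"]), "voting members")
```

A document of this shape is what you would hand to the `replSetInitiate` command (for example through a driver's admin command interface) when initiating the set; that step is left out here because it needs running servers.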

Hope this helps!

Upvotes: 10

Sammaye

Reputation: 43884

This really comes down to the CAP theorem: if there is an equal number of servers on either side of a partition, the database cannot maintain all of CAP (Consistency, Availability, and Partition tolerance). An arbiter is specifically designed to create an "imbalance", or majority, on one side so that a primary can be elected in this case.

If you get an equal number of nodes on either side, MongoDB will not elect a primary and your set will not accept writes.

Edit

By either side I mean, for example, 2 on one side and 2 on the other. My English wasn't easy to understand there.

So really what I mean is both sides.

Edit

Wikipedia presents quite a good case for explaining CAP: http://en.wikipedia.org/wiki/CAP_theorem

Upvotes: 9

Bruno Bronosky

Reputation: 70349

I created a spreadsheet to better illustrate the effect of Arbiter nodes in a Replica Set.

It basically comes down to these points:

  1. With an RS of 2 data nodes, losing 1 server brings you below your voting minimum (which is "greater than N/2"). An arbiter solves this.
  2. With an RS of an even number of data nodes, adding an arbiter increases your fault tolerance by 1 without making it possible to have 2 voting clusters due to a split.
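  3. With an RS of an odd number of data nodes, adding an arbiter makes the number of voters even, so a split can leave exactly N/2 votes on each side. Neither side then has "greater than N/2" votes, no primary can be elected, and the arbiter has not increased your fault tolerance.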
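The arithmetic behind these points can be sketched in a few lines. This is only an illustration of the "greater than N/2" rule, assuming every member (data node or arbiter) has exactly one vote; the function names are made up for this example.

```python
def majority(voters: int) -> int:
    """Votes needed to elect a primary: strictly more than half of all voters."""
    return voters // 2 + 1


def fault_tolerance(voters: int) -> int:
    """Number of voting members that can be lost while a majority remains."""
    return voters - majority(voters)


for data_nodes in (2, 3, 4, 5):
    alone = fault_tolerance(data_nodes)
    with_arbiter = fault_tolerance(data_nodes + 1)
    print(
        f"{data_nodes} data nodes: tolerates {alone} failure(s) alone, "
        f"{with_arbiter} with one arbiter"
    )
```

For 2 and 4 data nodes the arbiter raises the tolerance by one; for 3 and 5 data nodes the number stays the same.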

Elections are explained [in poor] detail here. That document states that an RS can have 50 members (an even number) and 7 voting members. I emphasize "states" because it does not explain how this works. To me it seems that if a split happens with 4 members (all voting) on one side and 46 members (3 voting) on the other, you'd rather have the 46 elect a primary and the 4 become a read-only cluster. But that's exactly what "limited voting" prevents. In that situation you will actually have a 4-member cluster with a primary and a 46-member cluster that is read-only. Explaining how that makes sense is beyond the scope of this question and beyond my knowledge.
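The 50-member scenario can be checked with a small sketch. It assumes only what is stated above: 7 votes in total, a primary needs strictly more than half of them, and non-voting members do not count. The function name and the dictionaries are invented for this example.

```python
def can_elect_primary(votes_on_side: int, total_votes: int) -> bool:
    """A side of a split can elect a primary only with a strict majority of all votes."""
    return votes_on_side > total_votes / 2


TOTAL_VOTES = 7  # voting members across the whole replica set

# The split described above: member count and votes on each side.
sides = {
    "small side": {"members": 4, "votes": 4},
    "large side": {"members": 46, "votes": 3},
}

for name, side in sides.items():
    outcome = (
        "can elect a primary"
        if can_elect_primary(side["votes"], TOTAL_VOTES)
        else "stays read-only"
    )
    print(f"{name}: {side['members']} members, {side['votes']} votes, {outcome}")
```

Only the vote count matters here, which is why the 4-member side wins the election even though it holds far fewer members.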

Upvotes: 20

Adil

Reputation: 2112

Arbiters are an optional mechanism that allows voting to succeed when you have an even number of mongods deployed in a replica set. Arbiters are lightweight and meant to be deployed on a server that is NOT a dedicated mongo replica, i.e. a server whose primary role is some other task, such as a Redis server. Since they are light, they won't (noticeably) interfere with the system's resources.

From the docs:

An arbiter does not have a copy of the data set and cannot become a primary. Replica sets may have arbiters to add a vote in elections for primary. Arbiters allow replica sets to have an uneven number of members, without the overhead of a member that replicates data.

Upvotes: 4
