Why do we need an 'arbiter' in MongoDB replication?

mongodb replication

卢声远 Shengyuan Lu · Aug 13, 2013 · Viewed 22.5k times · Source

Assume we setup a MongoDB replication without arbiter, If the primary is unavailable, the replica set will elect a secondary to be primary. So I think it's kind of implicit arbiter, since the replica will elect a primary automatically.

So I am wondering why do we need a dedicated arbiter node? Thanks!

Answer

I created a spreadsheet to better illustrate the effect of Arbiter nodes in a Replica Set.

It basically comes down to these points:

With an RS of 2 data nodes, losing 1 server brings you below your voting minimum (which is "greater than N/2"). An arbiter solves this.
With an RS of even numbered data nodes, adding an Arbiter increases your fault tolerance by 1 without making it possible to have 2 voting clusters due to a split.
With an RS of odd numbered data nodes, adding an Arbiter would allow a split to create 2 isolated clusters with "greater than N/2" votes and therefore a split brain scenario.

Elections are explained [in poor] detail here. In that document it states that an RS can have 50 members (even number) and 7 voting members. I emphasize "states" because it does not explain how it works. To me it seems that if you have a split happen with 4 members (all voting) on one side and 46 members (3 voting) on the other, you'd rather have the 46 elect a primary and the 4 to be a read-only cluster. But, that's exactly what "limited voting" prevents. In that situation you will actually have a 4 member cluster with a primary and a 46 member cluster that is read only. Explaining how that makes sense is out of the scope of this question and beyond my knowledge.

Why do we need an 'arbiter' in MongoDB replication?

Answer

Related questions