Why MongoDB config servers must be one or three only?

sephiroth66 picture sephiroth66 · Apr 26, 2013 · Viewed 12.6k times · Source

After reading the official documentation for the MongoDB sharding architecture I have not found out why you need to have one or three config servers, and not another number.

The MongoDB documentation on Config Servers says:

"If one or two config servers become unavailable, the cluster’s metadata becomes read only. You can still read and write data from the shards, but no chunk migrations or splits will occur until all three servers are available."

Hence the reflection: one server is equivalent to a single point of failure, but with two servers we have the same behavior as three, right?

So, why absolutely three servers and not only two or more, in example?

Because the doc says also:

Config servers do not run as replica sets.

Answer

Stennie picture Stennie · Apr 26, 2013

Config Server Protocols

MongoDB 3.0 and earlier only support a single type of config server deployment protocol which is referred to as the legacy SCCC (Sync Cluster Connection Configuration) as of MongoDB 3.2. An SCCC deployment has either 1 config server (development only) or 3 config servers (production).

MongoDB 3.2 deprecates the SCCC protocol and supports a new deployment type: Config Servers as Replica Sets (CSRS). A CSRS deployment has the same limits as a standard replica set, which can have 1 config server (development only) or up to 50 servers (production) as at MongoDB 3.2. A minimum of 3 CSRS servers is recommended for high availability in a production deployment, but additional servers may be useful for geographically distributed deployments.

SCCC (Sync Cluster Connection Configuration)

With SCCC, the config servers are updated using a two-phase commit protocol which requires consensus from multiple servers for a transaction. You can use a single config server for testing/development purposes, but in production usage you should always have 3. A practical answer for why you cannot use only 2 (or more than 3) servers in MongoDB is that the MongoDB code base only supports 1 or 3 config servers for an SCCC configuration.

Three servers provide a stronger guarantee of consistency than two servers, and allows for maintenance activity (for example, backups) on one config server while still having two servers available for your mongos to query. More than three servers would increase the time required to commit data across all servers.

The metadata for your sharded cluster needs to be identical across all config servers, and is maintained by the MongoDB sharding implementation. The metadata includes the essential details of which shards currently hold ranges of documents (aka chunks). In a SCCC configuration, config servers are not a replica set, so if one or more config servers are offline then the config data will be read only -- otherwise there is no means for the data to propagate to the offline config servers when they are back online.

Clearly 1 config server provides no redundancy or backup. With 2 config servers, a potential failure scenario is where the servers are available but the data on the servers does not agree (for example, one of the servers had some data corruption). With 3 config servers you can improve on the previous scenario: 2/3 servers might be consistent and you could identify the odd server out.

CSRS (Config Servers as Replica Sets)

MongoDB 3.2 deprecates the use of three mirrored mongod instances for config servers, and starting in 3.2 config servers are (by default) deployed as a replica set. Replica set config servers must use the WiredTiger 3.2+ storage engine (or another storage engine that supports the new readConcern read isolation semantics). CSRS also disallows some non-default replica set configuration options (e.g. arbiterOnly, buildIndexes, and slaveDelay) that are unsuitable for the sharded cluster metadata use case.

The CSRS deployment improves consistency and availability for config servers, since MongoDB can take advantage of the standard replica set read and write protocols for sharding config data. In addition, this allows a sharded cluster to have more than 3 config servers since a replica set can have up to 50 members (as at MongoDB 3.2).

With a CSRS deployment, write availability depends on maintaining a quorum of members that can see the current primary for a replica set. For example, a 3-node replica set would require 2 available members to maintain a primary. Additional members can be added for improved fault tolerance, subject to the same election rules as a normal replica set. A readConcern of majority is used by mongos to ensure that sharded cluster metadata can only be read once it is committed to a majority of replica set members and a readPreference of nearest is used to route requests to the nearest config server.