What exactly is the zookeeper quorum setting in hbase-site.xml?

raj · Dec 14, 2010

What exactly is the zookeeper quorum setting in hbase-site.xml?

Answer

MrGomez · Dec 14, 2010

The property is hbase.zookeeper.quorum. As described in hbase-default.xml, here's the setting:

Comma separated list of servers in the ZooKeeper Quorum. For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh this is the list of servers which we will start/stop ZooKeeper on.
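As a concrete illustration, the property could be set in hbase-site.xml along these lines. This is only a minimal sketch: the three hostnames are the placeholder hosts from the description above, not real machines, and a real file would normally carry other properties as well.

```xml
<configuration>
  <!-- Comma-separated list of ZooKeeper ensemble members.
       The hostnames below are placeholders taken from the description above. -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>host1.mydomain.com,host2.mydomain.com,host3.mydomain.com</value>
  </property>
</configuration>
```

No ports appear in the list; this sketch assumes the ensemble listens on ZooKeeper's default client port.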

What this actually does has been answered by Edward J. Yoon here. With editing on my part, for clarity:

Apache ZooKeeper is a coordination service for distributed applications, similar to Google's Chubby. Many projects use ZooKeeper, and we (Apache Hama) also use it for barrier synchronization in our Bulk Synchronous Parallel computing framework.

Today I read more about the Paxos and dynamic quorum features of the ZooKeeper project, to settle on a better name for the class org.apache.hama.zookeeper.QuorumPeer. Because the documentation is sparse ( http://hadoop.apache.org/zookeeper/docs/r3.0.0/api/index.html ), I didn't understand the meaning of "quorum"; the term was somewhat odd to me. But "org.apache.hama.zookeeper.QuorumPeer" is the proper name!! xD

So, what is a quorum, and why do we need one?

According to Wikipedia, a quorum is the minimum number of members of a deliberative body necessary to conduct the business of that group. Ordinarily, this is a majority of the people expected to be there, although many bodies may have a lower or higher quorum.

As you know, fault tolerance is one of the most important functions of a distributed system. The quorum algorithm is used to prevent a split-brain condition. When a split-brain condition occurs, ZooKeeper uses the quorum algorithm to determine the "Primary Partition" and the "Secondary Partition". The servers in the primary group then receive and process users' requests, and the servers in the secondary group become read-only.

When does the system recover from a split-brain condition? When the partitions are merged into one again. Internally, ZooKeeper uses its own atomic broadcast protocol (Zab) instead of Paxos.

You should also read the original version, in case I mistranslated the concepts he was trying to present.

My understanding of the quorum mechanism in Apache ZooKeeper is that it explicitly defines a replication quorum across several pre-defined hosts. If this quorum is not met, the hosts that disagree are split off into a secondary partition until ZooKeeper can reintegrate them with the primary partition.
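To put a number on "quorum is not met": ZooKeeper needs a strict majority of the configured servers to be able to talk to each other. As a rough sketch, with n servers in the list (the symbols n, q and f are introduced here for illustration and do not come from the quoted text), the majority size q and the number of server failures f that can be tolerated are:

```latex
q(n) = \left\lfloor \frac{n}{2} \right\rfloor + 1,
\qquad
f(n) = n - q(n) = \left\lceil \frac{n}{2} \right\rceil - 1
% n = 3: q = 2, f = 1
% n = 4: q = 3, f = 1
% n = 5: q = 3, f = 2
```

So three hosts survive one failure and five survive two, while going from three to four adds no extra tolerance, which is why an odd number of hosts is the usual choice.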

This adds more granularity to Hadoop's eventual consistency model. HBase, meanwhile, is (as of this writing) in the process of further integrating ZooKeeper with its code.