How does consumer rebalancing work in Kafka?

java_geek picture java_geek · Nov 28, 2014 · Viewed 10.1k times · Source

When a new consumer/brorker is added or goes down, Kafka triggers a rebalance operation. Is Kafka Rebalancing a blocking operation. Are Kafka consumers blocked while a rebalancing operation is in progress?

Answer

serejja picture serejja · Dec 2, 2014

Depends on what you mean by "blocked". If you mean "are existing connections closed when rebalance is triggered" then the answer is yes. The current Kafka's rebalancing algorithm is unfortunately imperfect.

Here is what is happening during consumer rebalance.

Assume we have a topic with 10 partitions (0-9), and one consumer (lets name it consumer1) consuming it. When a second consumer appears (consumer2) the rebalance task triggers for both of them (consumer1 gets an event, consumer2 does the initial rebalance). Now consumer1 closes all the existing connections (even those that will be reopened soon) and releases the partition ownership in Zookeeper for all 10 partitions.

Then it runs the partition assignment algorithm and decides what partitions should be claimed and claims the partition ownership in Zookeeper again. If the claim was successful consumer1 starts fetching his new partitions.

Meanwhile consumer2 runs the partition assignment algorithm as well and tries to claim his partitions in Zookeeper as well. Claim will succeed only when consumer1 releases the ownership on these partitions. When the claim is successful consumer2 starts fetching, or if it fails to claim partitions within a given amount of retries you get a rebalance failed after n retries exception.

As you noticed instead of just closing connections and releasing ownership for partitions consumer1 does not own anymore, it unnecessarily closes ALL his connections and restarts with just a lower amount of partitions. The same story with adding partitions (when we consume by a wildcard filter and new topic appears) - ALL connections are closed and then opened again instead of just opening new ones.

So I hope this answers your question - fetching stops when rebalance kicks in.