Error UNKNOWN_MEMBER_ID occurred while committing offsets for group xxx

Johnny Lim · Jul 15, 2016

I have been consuming logs with the Kafka client Java library. This worked for some time, but now it fails with the following errors:

2016-07-15 19:37:54.609  INFO 4342 --- [main] o.a.k.c.c.internals.AbstractCoordinator  : Marking the coordinator 2147483647 dead.
2016-07-15 19:37:54.933 ERROR 4342 --- [main] o.a.k.c.c.internals.ConsumerCoordinator  : Error UNKNOWN_MEMBER_ID occurred while committing offsets for group logstash
2016-07-15 19:37:54.933  WARN 4342 --- [main] o.a.k.c.c.internals.ConsumerCoordinator  : Auto offset commit failed: Commit cannot be completed due to group rebalance
2016-07-15 19:37:54.941 ERROR 4342 --- [main] o.a.k.c.c.internals.ConsumerCoordinator  : Error UNKNOWN_MEMBER_ID occurred while committing offsets for group logstash
2016-07-15 19:37:54.941  WARN 4342 --- [main] o.a.k.c.c.internals.ConsumerCoordinator  : Auto offset commit failed:
2016-07-15 19:37:54.948  INFO 4342 --- [main] o.a.k.c.c.internals.AbstractCoordinator  : Attempt to join group logstash failed due to unknown member id, resetting and retrying.

It keeps resetting.

Starting another instance of the same application also fails with errors immediately.

I suspect a problem with Kafka or its ZooKeeper, but I can't find any error in the logs.

Does anyone have an idea what's going on here?

This is the application I'm using: https://github.com/izeye/log-redirector

Answer

Tavo · Aug 16, 2016

I just faced the same issue. I have been investigating, and in this thread and in this wiki you can find the solution.

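The issue seems to be that processing a batch takes longer than the session timeout, so the coordinator considers the consumer dead and rebalances the group before the offsets can be committed. To fix it, increase the session timeout, poll more frequently, or limit the number of bytes received per fetch.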

What worked for me was changing max.partition.fetch.bytes. You can also modify session.timeout.ms or the timeout value you pass to consumer.poll(TIMEOUT).
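
For illustration, here is a minimal sketch of where those two settings would go in the consumer configuration. The class name ConsumerTuning, the method tunedProperties, and every concrete value (30000 ms, 262144 bytes, the String deserializers, localhost:9092) are assumptions made for the example, not values taken from the question or from my own setup. Pick values that fit how long your batches actually take to process.

```java
import java.util.Properties;

// Hypothetical helper: collects the consumer settings discussed above.
class ConsumerTuning {

    // Builds consumer properties. All numeric values are illustrative assumptions.
    static Properties tunedProperties(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        // How long the coordinator waits before treating the consumer as dead.
        // The broker must allow the chosen value (see group.max.session.timeout.ms).
        props.setProperty("session.timeout.ms", "30000");

        // Upper bound on the data fetched per partition, so that each batch
        // is small enough to process before the session times out.
        props.setProperty("max.partition.fetch.bytes", "262144");
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting configuration for inspection.
        tunedProperties("localhost:9092", "logstash").list(System.out);
    }
}
```

With kafka-clients on the classpath, the resulting Properties object would be passed to new KafkaConsumer<>(props). The poll timeout is not a configuration property; it is the argument you give to consumer.poll(...) in your consume loop.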