Kafka consumer "failed to find leader" when fetching topic metadata

Jordan Parmer · Jan 13, 2016

When I try to use the Kafka console producer and consumer scripts (0.9.0) to push messages to and pull messages from a topic, I get the errors below.

Producer Error

[2016-01-13 02:49:40,078] ERROR Error when sending message to topic test with key: null, value: 11 bytes with error: Failed to update metadata after 60000 ms. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)

Consumer Error

> [2016-01-13 02:47:18,620] WARN [console-consumer-90116_f89a0b380f19-1452653212738-9f857257-leader-finder-thread], Failed to find leader for Set([test,0]) (kafka.consumer.ConsumerFetcherManager$LeaderFinderThread)
> kafka.common.KafkaException: fetching topic metadata for topics [Set(test)] from broker [ArrayBuffer(BrokerEndPoint(0,192.168.99.100,9092))] failed
>     at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:73)
>     at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:94)
>     at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> Caused by: java.io.EOFException
>     at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:83)
>     at kafka.network.BlockingChannel.readCompletely(BlockingChannel.scala:129)
>     at kafka.network.BlockingChannel.receive(BlockingChannel.scala:120)
>     at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:77)
>     at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:74)
>     at kafka.producer.SyncProducer.send(SyncProducer.scala:119)
>     at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59)
>     ... 3 more

Why am I getting these errors, and how do I resolve them?

Configuration

All components run in Docker containers on a Mac, with ZooKeeper and Kafka in separate containers.

Docker Machine (boot2docker) IP Address: 192.168.99.100
ZooKeeper Port: 2181
Kafka Port: 9092

Kafka configuration file server.properties sets the following:

host.name=localhost
broker.id=0
port=9092
advertised.host.name=192.168.99.100
advertised.port=9092

Commands

I run the following commands from within the Kafka server Docker container. I've already created a topic with one partition and a replication factor of 1.

Notice that the leader designation is 0, which might be part of the problem.

root@f89a0b380f19:/opt/kafka/dist# ./bin/kafka-topics.sh --zookeeper 192.168.99.100:2181 --topic test --describe
Topic:test  PartitionCount:1    ReplicationFactor:1 Configs:
    Topic: test Partition: 0    Leader: 0   Replicas: 0 Isr: 0

I then do the following to send some messages:

root@f89a0b380f19:/opt/kafka/dist# ./bin/kafka-console-producer.sh --broker-list 192.168.99.100:9092 --topic test
one message
two message
three message
four message
[2016-01-13 02:49:40,078] ERROR Error when sending message to topic test with key: null, value: 11 bytes with error: Failed to update metadata after 60000 ms. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
[2016-01-13 02:50:40,080] ERROR Error when sending message to topic test with key: null, value: 11 bytes with error: Failed to update metadata after 60000 ms. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
[2016-01-13 02:51:40,081] ERROR Error when sending message to topic test with key: null, value: 13 bytes with error: Failed to update metadata after 60000 ms. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
[2016-01-13 02:52:40,083] ERROR Error when sending message to topic test with key: null, value: 12 bytes with error: Failed to update metadata after 60000 ms. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)

This is the command I use to try to consume messages; it yields the consumer error shown above.

root@f89a0b380f19:/opt/kafka/dist# ./bin/kafka-console-consumer.sh --zookeeper 192.168.99.100:2181 --topic test --from-beginning

I've confirmed ports 2181 and 9092 are open and accessible from within the Kafka Docker container:

root@f89a0b380f19:/# nc -z 192.168.99.100 2181; echo $?;
0
root@f89a0b380f19:/# nc -z 192.168.99.100 9092; echo $?;
0

Answer

Jordan Parmer · Jan 13, 2016

The solution wasn't what I expected at all. The error message did not line up with what was really happening.

The primary problem was that I had mounted Kafka's log directory in Docker onto my local file system. My docker run command used a volume mount to map the Kafka log.dir folder in the container to a directory on the host VM, and that directory was itself mounted from my Mac. That latter point was the problem.

For instance,

docker run --name kafka -v /Users/<me>/kafka/logs:/var/opt/kafka:rw -p 9092:9092 -d kafka

Since I'm on a Mac and use docker-machine (e.g. boot2docker), I have to mount through my /Users/ path, which boot2docker auto-mounts in the host VM. Because that directory in the VM is itself a mount from the Mac, Kafka's I/O wasn't able to work with it correctly. If the volume were mounted to a directory directly on the host Linux VM (i.e. the boot2docker machine), it would work.
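One way to see what kind of file system backs the Kafka log directory is to look it up in /proc/mounts from inside the container. The following is a minimal sketch, not something from my original setup: the function name fs_type_of is hypothetical, and I'm assuming a Linux container where the log directory (/var/opt/kafka in the docker run command above) is itself a mount point.

```shell
#!/bin/sh
# fs_type_of MOUNTPOINT [MOUNTS_FILE]
# Print the file system type recorded for an exact mount point.
# MOUNTS_FILE defaults to /proc/mounts (Linux only). Prints nothing
# if MOUNTPOINT is not listed as a mount point.
fs_type_of() {
    awk -v mp="$1" '
        $2 == mp { type = $3 }               # keep the last (topmost) match
        END      { if (type != "") print type }
    ' "${2:-/proc/mounts}"
}

# Example: the container-side path from the docker run command above.
fs_type_of /var/opt/kafka
```

If this prints a shared-folder type rather than a local disk file system (with the VirtualBox driver I would expect something like vboxsf, but that is an assumption), the directory is coming from the Mac rather than from the VM's own disk.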

I can't explain the exact details since I don't know the ins and outs of Kafka I/O, but when I remove the volume mounted from my Mac file system, it works.
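As a rough illustration of the alternative, here is a sketch of a small helper that builds the same docker run command but refuses host paths under /Users/. The helper name kafka_run_cmd and the path /var/lib/kafka/logs are hypothetical; any directory that lives on the Linux VM itself should serve the same purpose, and whether it survives a VM restart depends on how the VM is set up. The helper only prints the command, it does not run it.

```shell
#!/bin/sh
# kafka_run_cmd HOST_DIR
# Print (do not execute) a docker run command that mounts HOST_DIR as the
# Kafka log directory. Paths under /Users/ are rejected because, in the
# setup described above, they are shared from the Mac into the VM.
kafka_run_cmd() {
    host_dir="$1"
    case "$host_dir" in
        /Users/*)
            echo "refusing: $host_dir is shared from the Mac" >&2
            return 1
            ;;
    esac
    echo "docker run --name kafka -v ${host_dir}:/var/opt/kafka:rw -p 9092:9092 -d kafka"
}

# /var/lib/kafka/logs is an assumed, VM-local path (not from my setup).
kafka_run_cmd /var/lib/kafka/logs
```

Because the docker client on the Mac talks to the daemon inside the VM, the host side of -v is resolved on the VM, so the printed command can be run from the Mac as usual.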