I have an application for downloading specific web-content, from a stream of URL's generated from 1 Kafka-producer. I've created a topic with 5 partitions and there are 5 kafka-consumers. However the timeout for the webpage download is 60 seconds. While one of the url is getting downloaded, the server assumes that the message is lost and resends the data to different consumers.
I've tried everything mentioned in
Kafka consumer configuration / performance issues
and
https://github.com/spring-projects/spring-kafka/issues/202
But I keep getting different errors everytime.
Is it possible to tie a specific consumer with a partition in kafka? I am using kafka-python for my application
I missed on the documentation of Kafka-python. We can use TopicPartition class to assign a specific consumer with one partition.
http://kafka-python.readthedocs.io/en/master/
>>> # manually assign the partition list for the consumer
>>> from kafka import TopicPartition
>>> consumer = KafkaConsumer(bootstrap_servers='localhost:1234')
>>> consumer.assign([TopicPartition('foobar', 2)])
>>> msg = next(consumer)