Docker Kafka w/ Python consumer

Pankaj Saboo picture Pankaj Saboo · Sep 21, 2018 · Viewed 8.7k times · Source

I am using dockerized Kafka and written one Kafka consumer program. It works perfectly when I run Kafka in docker and application at my local machine. But when I configured the local application in docker I am facing issues. The issue may be due to a topic not created until time application started.

docker-compose.yml

version: '3'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: localhost
      KAFKA_CREATE_TOPICS: "test:1:1"
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
  parse-engine:
    build: .
    depends_on:
      - "kafka"
    command: python parse-engine.py
    ports:
     - "5000:5000"

parse-engine.py

from kafka import KafkaConsumer
import json

try:
    print('Welcome to parse engine')
    consumer = KafkaConsumer('test', bootstrap_servers='localhost:9092')
    for message in consumer:
        print(message)
except Exception as e:
    print(e)
    # Logs the error appropriately. 
    pass

Error log

kafka_1         | [2018-09-21 06:27:17,400] INFO [SocketServer brokerId=1001] Started processors for 1 acceptors (kafka.network.SocketServer)
kafka_1         | [2018-09-21 06:27:17,404] INFO Kafka version : 2.0.0 (org.apache.kafka.common.utils.AppInfoParser)
kafka_1         | [2018-09-21 06:27:17,404] INFO Kafka commitId : 3402a8361b734732 (org.apache.kafka.common.utils.AppInfoParser)
kafka_1         | [2018-09-21 06:27:17,431] INFO [KafkaServer id=1001] started (kafka.server.KafkaServer)
**parse-engine_1  | Welcome to parse engine
parse-engine_1  | NoBrokersAvailable 
parseengine_parse-engine_1 exited with code 0**
kafka_1         | creating topics: test:1:1

As I already added depends_on property in docker-compose but before starting topic application connecting so error occurred.

I read that I can possible to add the script in the docker-compose file but I am looking for some easy way.

Thanks for help

Answer

Robin Moffatt picture Robin Moffatt · Sep 21, 2018

Your problem is the networking. In your Kafka config you're setting

KAFKA_ADVERTISED_HOST_NAME: localhost

but this means that any client (including your python app) will connect to the broker, and then be told by the broker to use localhost for any connections. Since localhost from your client machine (e.g. your python container) is not where the broker is, requests will fail.

You can read more about Kafka listeners in detail here: https://rmoff.net/2018/08/02/kafka-listeners-explained/

So to fix your issue, you can do one of two things:

  1. Simply change your compose to use the internal hostname for Kafka (KAFKA_ADVERTISED_HOST_NAME: kafka). This means any clients within the docker network will be able to access it fine, but no external clients will be able to (e.g. from your host machine):

    version: '3'
    services:
    zookeeper:
        image: wurstmeister/zookeeper
        ports:
        - "2181:2181"
    kafka:
        image: wurstmeister/kafka
        ports:
        - "9092:9092"
        environment:
        KAFKA_ADVERTISED_HOST_NAME: kafka
        KAFKA_CREATE_TOPICS: "test:1:1"
        KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
        volumes:
        - /var/run/docker.sock:/var/run/docker.sock
    parse-engine:
        build: .
        depends_on:
        - "kafka"
        command: python parse-engine.py
        ports:
        - "5000:5000"
    

    Your clients would then access the broker at kafka:9092, so your python app would change to

    consumer = KafkaConsumer('test', bootstrap_servers='kafka:9092')
    
  2. Add a new listener to Kafka. This enables it to be accessed both internally and externally to the docker network. Port 29092 would be for access external to the docker network (e.g. from your host), and 9092 for internal access.

    You would still need to change your python program to access Kafka at the correct address. In this case since it's internal to the Docker network, you'd use:

    consumer = KafkaConsumer('test', bootstrap_servers='kafka:9092')
    

    Since I'm not familiar with the wurstmeister images, this docker-compose is based on the Confluent images which I do know:

    (editor has mangled my yaml, you can find it here)

    ---
    version: '2'
    services:
      zookeeper:
        image: confluentinc/cp-zookeeper:latest
        environment:
          ZOOKEEPER_CLIENT_PORT: 2181
          ZOOKEEPER_TICK_TIME: 2000
    
      kafka:
        # "`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-
        # An important note about accessing Kafka from clients on other machines: 
        # -----------------------------------------------------------------------
        #
        # The config used here exposes port 29092 for _external_ connections to the broker
        # i.e. those from _outside_ the docker network. This could be from the host machine
        # running docker, or maybe further afield if you've got a more complicated setup. 
        # If the latter is true, you will need to change the value 'localhost' in 
        # KAFKA_ADVERTISED_LISTENERS to one that is resolvable to the docker host from those 
        # remote clients
        #
        # For connections _internal_ to the docker network, such as from other services
        # and components, use kafka:9092.
        #
        # See https://rmoff.net/2018/08/02/kafka-listeners-explained/ for details
        # "`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-
        #
        image: confluentinc/cp-kafka:latest
        depends_on:
          - zookeeper
        ports:
          - 29092:29092
        environment:
          KAFKA_BROKER_ID: 1
          KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
          KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
          KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    

Disclaimer: I work for Confluent