Kafka Serializer JSON

user3780651 · Oct 5, 2015 · Viewed 19.2k times

I am new to Kafka, serialization, and JSON.

What I want is for the producer to send a JSON file via Kafka and for the consumer to consume and work with the JSON file in its original form.

I was able to get it working so that the JSON is converted to a string and sent via a String serializer, and the consumer then parses the string and recreates a JSON object. But I am worried that this isn't efficient or the correct method (I might lose the field types of the JSON).
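For what it's worth, the string round trip itself is lossless: encoding a JSON string to bytes and decoding it back preserves the text exactly, so field types survive as long as the consumer re-parses the text with a JSON library. A minimal stdlib-only sketch (the class and method names here are illustrative, not from any Kafka API):

```java
import java.nio.charset.StandardCharsets;

public class JsonBytesDemo {
    // Mimics what a String serializer does on the wire: JSON text -> bytes.
    static byte[] toBytes(String json) {
        return json.getBytes(StandardCharsets.UTF_8);
    }

    // Mimics the consumer side: bytes -> JSON text, ready for re-parsing.
    static String fromBytes(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String json = "{\"name\":\"Kate\",\"age\":25}";
        byte[] wire = toBytes(json);
        // The round trip reproduces the original text exactly,
        // so numeric fields like "age" are still numbers after re-parsing.
        System.out.println(fromBytes(wire).equals(json)); // true
    }
}
```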

So I looked into making a JSON serializer and setting that in my producer's configurations.

I used the JsonEncoder here : Kafka: writing custom serializer

But when I try to run my producer now, it seems that in the toBytes function of the encoder, the try block never returns anything like I want it to:

    try {
        bytes = objectMapper.writeValueAsString(object).getBytes();
    } catch (JsonProcessingException e) {
        logger.error(String.format("Json processing failed for object: %s",
                object.getClass().getName()), e);
    }

It seems objectMapper.writeValueAsString(object).getBytes() takes my JSON object ({"name":"Kate","age":25}) and converts it to nothing.
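One likely cause (an assumption, not confirmed by the question): Jackson's ObjectMapper serializes objects by reflecting over their bean getters, and org.json's JSONObject does not expose its fields that way, so writeValueAsString produces nothing useful. A hedged workaround is to skip Jackson entirely and use the fact that JSONObject.toString() already emits valid JSON text. The Encoder interface below is a minimal stand-in for the old kafka.serializer.Encoder so the sketch is self-contained:

```java
import java.nio.charset.StandardCharsets;

// Minimal stand-in for kafka.serializer.Encoder<T> from the old Scala
// client, included only so this sketch compiles on its own.
interface Encoder<T> {
    byte[] toBytes(T t);
}

// Hypothetical encoder: rather than asking Jackson to reflect over a
// JSONObject, serialize it via its own toString(), which emits JSON text.
class JsonObjectEncoder implements Encoder<Object> {
    @Override
    public byte[] toBytes(Object record) {
        // org.json.JSONObject.toString() returns the JSON representation
        return record.toString().getBytes(StandardCharsets.UTF_8);
    }
}

public class EncoderDemo {
    public static void main(String[] args) {
        // Stand-in whose toString() mimics JSONObject's output, so the
        // demo runs without the org.json dependency.
        Object record = new Object() {
            @Override public String toString() {
                return "{\"name\":\"Kate\",\"age\":25}";
            }
        };
        byte[] bytes = new JsonObjectEncoder().toBytes(record);
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```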

This is my producer's run function:

    List<KeyedMessage<String, JSONObject>> msgList = new ArrayList<KeyedMessage<String, JSONObject>>();

    JSONObject record = new JSONObject();
    record.put("name", "Kate");
    record.put("age", 25);

    msgList.add(new KeyedMessage<String, JSONObject>(topic, record));

    producer.send(msgList);

What am I missing? Would my original method (convert to a string, send it, and then rebuild the JSON object) be okay, or is it just not the correct way to go?

Thanks!

Answer

Michael G. Noll · Oct 6, 2015

Hmm, why are you afraid that a serialize/deserialize step would cause data loss?

One option you have is to use the Kafka JSON serializer that's included in Confluent's Schema Registry, which is free and open source software (disclaimer: I work at Confluent). Its test suite provides a few examples to get you started, and further details are described at serializers and formatters. The benefit of this JSON serializer and the schema registry itself is that they provide transparent integration with producer and consumer clients for Kafka. Apart from JSON there's also support for Apache Avro if you need that.
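Wiring Confluent's JSON serializer into a producer is mostly a matter of configuration. A hedged sketch (the serializer class name follows Confluent's schema-registry packages; verify it against the version you actually depend on):

```java
import java.util.Properties;

public class ProducerConfigSketch {
    // Assumption: the value serializer class name matches Confluent's
    // schema-registry artifact; check your dependency's documentation.
    public static Properties jsonProducerProps(String brokers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", brokers);
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "io.confluent.kafka.serializers.KafkaJsonSerializer");
        return props;
    }

    public static void main(String[] args) {
        // With this config, a KafkaProducer<String, MyPojo> can send POJOs
        // directly; the serializer handles the JSON conversion.
        System.out.println(
            jsonProducerProps("localhost:9092").getProperty("value.serializer"));
    }
}
```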

IMHO this setup is one of the best options in terms of developer convenience and ease of use when talking to Kafka in JSON -- but of course YMMV!