I am new to Kafka, serialization, and JSON.
What I want is for the producer to send a JSON file via Kafka and for the consumer to consume and work with the JSON file in its original form.
I was able to get it working so that the JSON is converted to a string and sent via a StringSerializer, and the consumer then parses the string and recreates the JSON object, but I am worried that this isn't efficient or the correct method (I might lose the field types of the JSON).
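For reference, this is roughly what that string-based version looked like (a sketch using the same 0.8 producer API as below; the broker address is a placeholder and topic is the same topic string used further down):

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;
import org.json.JSONObject;

Properties props = new Properties();
props.put("metadata.broker.list", "localhost:9092");            // placeholder broker address
props.put("serializer.class", "kafka.serializer.StringEncoder"); // send values as plain strings
Producer<String, String> producer = new Producer<String, String>(new ProducerConfig(props));

JSONObject record = new JSONObject();
record.put("name", "Kate");
record.put("age", 25);

// send the JSON as a plain string...
producer.send(new KeyedMessage<String, String>(topic, record.toString()));

// ...and on the consumer side rebuild it from the message bytes, e.g.:
// JSONObject rebuilt = new JSONObject(new String(messageAndMetadata.message()));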
So I looked into writing a JSON serializer and setting that in my producer's configuration.
I used the JsonEncoder from here: Kafka: writing custom serializer
But when I try to run my producer now, it seems that the try block in the encoder's toBytes function never produces the bytes I expect:
try {
    bytes = objectMapper.writeValueAsString(object).getBytes();
} catch (JsonProcessingException e) {
    // if serialization throws, bytes is never assigned; only the error is logged
    logger.error(String.format("Json processing failed for object: %s", object.getClass().getName()), e);
}
It seems that objectMapper.writeValueAsString(object).getBytes() takes my JSON object ({"name":"Kate","age":25}) and converts it to nothing.
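For context, here is the full encoder class as I have it, reconstructed from the linked answer (field names, logger, and package details may differ slightly from the original):

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import kafka.serializer.Encoder;
import kafka.utils.VerifiableProperties;
import org.apache.log4j.Logger;

public class JsonEncoder implements Encoder<Object> {
    private static final Logger logger = Logger.getLogger(JsonEncoder.class);
    private static final ObjectMapper objectMapper = new ObjectMapper();

    // the old producer API instantiates the encoder reflectively through this constructor
    public JsonEncoder(VerifiableProperties verifiableProperties) {
    }

    @Override
    public byte[] toBytes(Object object) {
        byte[] bytes = null;
        try {
            bytes = objectMapper.writeValueAsString(object).getBytes();
        } catch (JsonProcessingException e) {
            logger.error(String.format("Json processing failed for object: %s",
                    object.getClass().getName()), e);
        }
        return bytes; // null (i.e. "nothing") whenever the catch block runs
    }
}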
This is my producer's run function:
List<KeyedMessage<String, JSONObject>> msgList = new ArrayList<KeyedMessage<String, JSONObject>>();

JSONObject record = new JSONObject();
record.put("name", "Kate");
record.put("age", 25);

msgList.add(new KeyedMessage<String, JSONObject>(topic, record));
producer.send(msgList);
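And this is roughly how I wired the encoder into the producer's configuration (the broker address and the encoder's package name are placeholders for my real values):

Properties props = new Properties();
props.put("metadata.broker.list", "localhost:9092");            // placeholder broker address
props.put("serializer.class", "com.example.JsonEncoder");        // the custom encoder above
props.put("key.serializer.class", "kafka.serializer.StringEncoder"); // keys stay plain strings
Producer<String, JSONObject> producer = new Producer<String, JSONObject>(new ProducerConfig(props));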
What am I missing? Would my original method (convert to a string, send, and then rebuild the JSON object) be okay, or is it just not the correct way to go?
Thanks!
Hmm, why are you afraid that a serialize/deserialize step would cause data loss?
One option you have is to use the Kafka JSON serializer that's included in Confluent's Schema Registry, which is free and open source software (disclaimer: I work at Confluent). Its test suite provides a few examples to get you started, and further details are described at serializers and formatters. The benefit of this JSON serializer and the schema registry itself is that they provide transparent integration with producer and consumer clients for Kafka. Apart from JSON there's also support for Apache Avro if you need that.
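To give a rough idea of what the producer side looks like with it (a sketch from memory, so please verify the exact class names and config keys against the test suite and docs linked above):

import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "io.confluent.kafka.serializers.KafkaJsonSerializer");

KafkaProducer<String, Map<String, Object>> producer = new KafkaProducer<String, Map<String, Object>>(props);

Map<String, Object> record = new HashMap<String, Object>();
record.put("name", "Kate");
record.put("age", 25);

// the serializer turns the value into JSON bytes via Jackson under the hood
producer.send(new ProducerRecord<String, Map<String, Object>>("my-topic", record));
producer.close();

On the consumer side the matching KafkaJsonDeserializer (again, check the docs for the exact name and its config) does the reverse, so you never deal with raw strings yourself.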
IMHO this setup is one of the best options in terms of developer convenience and ease of use when talking to Kafka in JSON -- but of course YMMV!