Unknown magic byte with kafka-avro-console-consumer

Martin Macak · Sep 19, 2018 · Viewed 9.3k times

I have been trying to connect Confluent's kafka-avro-console-consumer to our legacy Kafka cluster, which was deployed without Confluent Schema Registry. I provided the schemas explicitly using properties, like this:

kafka-avro-console-consumer --bootstrap-server kafka02.internal:9092 \
    --topic test \
    --from-beginning \
    --property key.schema='{"type":"long"}' \
    --property value.schema='{"type":"long"}'

but I am getting an 'Unknown magic byte!' error, raised as org.apache.kafka.common.errors.SerializationException.

Is it possible to use Confluent's kafka-avro-console-consumer to consume Avro messages from Kafka that were not serialized with Confluent's AvroSerializer and Schema Registry?

Answer

Robin Moffatt · Sep 19, 2018

The Confluent Schema Registry serializer/deserializer uses a wire format that puts schema information in the initial bytes of each message: a magic byte (currently 0) followed by the 4-byte schema ID, and only then the Avro-encoded data.

If your message was not serialized with the Schema Registry serializer, you won't be able to deserialize it with the Schema Registry deserializer, and you will get the Unknown magic byte! error.
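To make the failure concrete, here is a minimal Python sketch of the check the deserializer effectively performs. It assumes the Confluent framing is one magic byte with value 0, then a 4-byte big-endian schema ID, then the Avro body. The function name split_confluent_frame is made up for illustration and is not part of any Confluent library.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0  # first byte of every Confluent-framed message


def split_confluent_frame(payload: bytes) -> Tuple[int, bytes]:
    """Return (schema_id, avro_body) for a Confluent-framed payload.

    Raises ValueError when the payload does not start with the framing
    header, which is the situation behind "Unknown magic byte!".
    """
    if len(payload) < 5 or payload[0] != MAGIC_BYTE:
        raise ValueError("Unknown magic byte!")
    # Bytes 1-4 hold the schema ID as a big-endian unsigned 32-bit integer.
    (schema_id,) = struct.unpack(">I", payload[1:5])
    return schema_id, payload[5:]
```

A message written by a plain Avro encoder starts directly with Avro data, so its first bytes are not this header and the check fails before any schema you pass on the command line is even considered.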

So you'll need to write a consumer that pulls the messages, deserializes them using your Avro .avsc schemas, and then, assuming you want to preserve the data, re-serializes them using the Schema Registry serializer.
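As a rough illustration of the decode and re-frame steps, the Python sketch below handles only the {"type":"long"} schema from the question. It assumes the legacy messages are bare Avro binary (no container-file header and no single-object encoding). The helper names and the schema ID 42 are hypothetical: in a real pipeline the ID comes from registering the schema with Schema Registry, and you would use an Avro library plus the Schema Registry serializer rather than framing by hand. The Kafka consume and produce loop is left out.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0
SCHEMA_ID = 42  # hypothetical; the real ID is assigned by Schema Registry


def decode_avro_long(data: bytes) -> Tuple[int, int]:
    """Decode one Avro long (zigzag varint); return (value, bytes_consumed)."""
    acc = 0
    shift = 0
    for i, byte in enumerate(data):
        acc |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if not byte & 0x80:            # high bit clear marks the last byte
            return (acc >> 1) ^ -(acc & 1), i + 1  # undo zigzag encoding
        shift += 7
    raise ValueError("truncated Avro long")


def add_confluent_frame(schema_id: int, avro_body: bytes) -> bytes:
    """Prefix raw Avro bytes with the magic byte and big-endian schema ID."""
    return bytes([MAGIC_BYTE]) + struct.pack(">I", schema_id) + avro_body


def reframe_long_message(raw: bytes, schema_id: int = SCHEMA_ID) -> bytes:
    """Validate a raw Avro long message, then emit it with Confluent framing."""
    _, consumed = decode_avro_long(raw)
    if consumed != len(raw):
        raise ValueError("unexpected trailing bytes after Avro long")
    return add_confluent_frame(schema_id, raw)
```

Decoding first is a deliberate choice here: it confirms that the bytes really match the schema you believe they were written with before they are republished under a registered schema ID.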

Edit: I wrote an article recently that explains this whole thing in more depth: https://www.confluent.io/blog/kafka-connect-deep-dive-converters-serialization-explained