Kafka set compression type at producer vs topic

shants · Feb 7, 2018 · Viewed 16.2k times

What's the difference between the following ways of enabling compression in Kafka?

Approach 1: Create a topic using the command:

bin/kafka-topics.sh --create --zookeeper localhost:2181 --config compression.type=gzip --topic test

Approach 2: Set the property compression.type=gzip in the Kafka producer client configuration.

I get better compression and higher throughput when using Approach 1.

If I use Approach 1, does that mean compression occurs at the broker end, while in Approach 2 the messages are compressed at the producer end and then sent to the broker?

Answer

amethystic · Feb 8, 2018

If I use Approach 1, does it mean that the compression occurs at the broker end?

It depends. If the producer does not set compression.type, or sets a codec different from the topic's, the messages are compressed (or recompressed) at the broker end. However, if the producer also sets compression.type to gzip, the broker does not need to compress them again. Strictly speaking, other conditions must also be met for the broker to skip recompression, but those are somewhat beyond the scope of this answer.
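The rule above can be sketched as a small decision function. This is a simplified model written for illustration, not Kafka's actual broker code: the function names are hypothetical, and it deliberately ignores the additional conditions mentioned above. It also assumes the topic-level value `producer` (the default for a topic's compression.type), which tells the broker to keep whatever codec the producer used.

```python
# Simplified model of the behaviour described above; not Kafka's real logic.
# It ignores the extra conditions the answer alludes to.

def normalize(codec: str) -> str:
    """Treat the producer's 'none' and the topic's 'uncompressed' as equivalent."""
    return "uncompressed" if codec in ("none", "uncompressed") else codec


def broker_must_recompress(producer_codec: str, topic_codec: str) -> bool:
    """Return True if the broker has to re-encode incoming batches for the topic."""
    if topic_codec == "producer":
        # Topic-level default: retain the codec chosen by the producer.
        return False
    # Any mismatch between the two codecs forces the broker to do the work.
    return normalize(producer_codec) != normalize(topic_codec)


if __name__ == "__main__":
    for producer_codec in ("none", "snappy", "gzip"):
        verdict = broker_must_recompress(producer_codec, "gzip")
        print(f"producer={producer_codec:<6} topic=gzip -> broker recompresses: {verdict}")
```

Under this model, Approach 1 on its own (producer left at `none`, topic set to gzip) puts the compression work on the broker, whereas setting gzip on both sides lets the broker keep the producer's batches as they are.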

In Approach 2, are the messages compressed at the producer end and then sent to the broker?

Yes, records are compressed before being sent to the broker if the producer sets its compression.type config.
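As a rough illustration of what producer-side compression buys, the sketch below compresses a batch of records on the "client" using only Python's standard library. It does not use a Kafka client or Kafka's wire format; the records and the newline-joined batch layout are made up for the example. The point is only that the batch is compressed before it leaves the producer, so fewer bytes cross the network.

```python
import gzip
import json

# Hypothetical records; in a real producer these would be serialized messages.
records = [{"user_id": i, "event": "page_view", "path": "/home"} for i in range(500)]

# Stand-in for a producer batch: records serialized and joined together.
batch = b"\n".join(json.dumps(r).encode("utf-8") for r in records)

# Conceptually what compression.type=gzip on the producer amounts to:
# the batch is compressed on the client and only the compressed bytes are sent.
compressed = gzip.compress(batch)

print(f"uncompressed batch: {len(batch)} bytes")
print(f"gzip-compressed:    {len(compressed)} bytes")

# The receiving side can recover the original bytes unchanged.
assert gzip.decompress(compressed) == batch
```

Because the whole batch is compressed as one unit, repetitive records like these shrink considerably; the exact ratio depends on the data and on how many records end up in each batch.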