What's the difference between the following ways of enabling compression in Kafka:
Approach 1: Create a topic using the command:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --config compression.type=gzip --topic test
Approach 2: Set the property compression.type = gzip in the Kafka producer client API.
I get better compression and higher throughput when using Approach 1.
If I use Approach 1, does it mean that the compression occurs at the broker end, while in Approach 2 the messages are compressed at the producer end and then sent to the broker?
If I use Approach 1, does it mean that the compression occurs at the broker end?
It depends. If the producer does not set a compression.type or sets a different one, then the message will be compressed at the broker end. However, if the producer also sets compression.type to gzip, there is no need to compress again at the broker end. Actually, there are other strict conditions that must be met for the broker to skip recompression, although that is a little beyond the scope of this question.
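As a rough illustration of the rule above, here is a minimal Java sketch that models only the codec comparison. The helper brokerRecompresses is hypothetical (it is not Kafka's broker code), and it deliberately ignores the additional conditions mentioned above. It assumes the topic-level compression.type can also be "producer", which means "keep whatever codec the producer used", and that the producer's "none" corresponds to the topic's "uncompressed".

```java
public class BrokerCompressionSketch {

    // Hypothetical helper: a simplified model of the decision, not Kafka's actual logic.
    // Returns true when the broker would have to recompress the batch to match the topic codec.
    public static boolean brokerRecompresses(String topicCompressionType,
                                             String producerCompressionType) {
        // "producer" at the topic level means: retain the codec chosen by the producer.
        if (topicCompressionType.equals("producer")) {
            return false;
        }
        // The producer names the no-compression setting "none";
        // the topic-level config names it "uncompressed".
        String producerCodec = producerCompressionType.equals("none")
                ? "uncompressed"
                : producerCompressionType;
        // Recompression is needed only when the two codecs differ.
        return !topicCompressionType.equals(producerCodec);
    }

    public static void main(String[] args) {
        System.out.println(brokerRecompresses("gzip", "none"));   // producer did not compress
        System.out.println(brokerRecompresses("gzip", "snappy")); // producer used another codec
        System.out.println(brokerRecompresses("gzip", "gzip"));   // codecs already match
    }
}
```

Under this simplified model, the first two calls report that the broker recompresses and the third reports that it does not.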
In Approach 2, are the messages compressed at the producer end and then sent to the broker?
Yes, records will be compressed before being sent to the broker if the producer sets its compression.type config.
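For reference, a minimal sketch of the producer-side setting in Java might look like the following. The class and method names (ProducerCompressionConfig, gzipProducerProps) and the broker address are assumptions made for illustration. The block only builds the configuration with the standard library; with the kafka-clients library on the classpath, the resulting Properties could be passed to new KafkaProducer<>(props).

```java
import java.util.Properties;

public class ProducerCompressionConfig {

    // Hypothetical helper: builds producer settings with gzip compression enabled.
    public static Properties gzipProducerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Batches are compressed by the producer before they are sent to the broker.
        props.setProperty("compression.type", "gzip");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is an assumed broker address for illustration only.
        Properties props = gzipProducerProps("localhost:9092");
        System.out.println(props.getProperty("compression.type"));
    }
}
```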