How to test whether log compaction is working or not in Kafka?

tanmayghosh2507 · Dec 15, 2015 · Viewed 10.9k times

I changed the server.properties file in Kafka 0.8.1.1 to add log.cleaner.enable=true, and I also set cleanup.policy=compact when creating the topic. To test it, I pushed the following messages to the topic as (Key, Message) pairs:

  • Offset: 1 - (123, abc);
  • Offset: 2 - (234, def);
  • Offset: 3 - (345, ghi);
  • Offset: 4 - (123, changed)

The 4th message has the same key as an earlier one but a different value, so log compaction should come into play here. However, using Kafka tool, I can still see all 4 offsets in the topic. How can I tell whether log compaction is working? Should the earlier message have been deleted, or is compaction working correctly simply because the new message was accepted?

Does this have anything to do with the log.retention.hours, topic.log.retention.hours, or log.retention.size configurations? What role do these configs play in log compaction?

P.S. I have gone through the Apache documentation thoroughly, but it is still not clear to me.

Answer

Jannixx · Aug 11, 2016

Even though this question is a few months old, I came across it while researching my own question. I created a minimal Java example to see how compaction works; maybe it is helpful for you too:

https://gist.github.com/anonymous/f78184eaeec3ee82b15182aec24a432a

After consulting the documentation, I also used the following topic-level configuration so that compaction kicks in as quickly as possible:

min.cleanable.dirty.ratio=0.01
cleanup.policy=compact
segment.ms=100
delete.retention.ms=100

When run, the class from the gist shows that compaction works: there is only ever one message with the same key on the topic.

With the appropriate settings, this would also be reproducible from the command line.
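
To make the expected outcome concrete, here is a minimal sketch that models compaction in plain Java, without a broker, using the four messages from the question. The class and method names (CompactionModel, compact, hasDuplicateKeys) are made up for illustration and are not part of Kafka or of the gist. It is a simplification: the real log cleaner works segment by segment, never cleans the active segment, and only runs once the dirty ratio is exceeded, so recently written duplicates can stay visible for a while.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-alone model of log compaction; not Kafka code.
public class CompactionModel {

    /** An (offset, key, value) triple standing in for a consumed record. */
    static final class Message {
        final long offset;
        final String key;
        final String value;

        Message(long offset, String key, String value) {
            this.offset = offset;
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return "Offset: " + offset + " - (" + key + ", " + value + ")";
        }
    }

    /**
     * Keeps only the latest message per key. The input is assumed to be in
     * offset order. Surviving messages keep their original offsets.
     */
    static List<Message> compact(List<Message> log) {
        Map<String, Message> latestByKey = new LinkedHashMap<>();
        for (Message m : log) {
            // Remove first so the re-inserted key moves to the end,
            // which keeps the result sorted by offset.
            latestByKey.remove(m.key);
            latestByKey.put(m.key, m);
        }
        return new ArrayList<>(latestByKey.values());
    }

    /** Returns true if any key occurs more than once in the given messages. */
    static boolean hasDuplicateKeys(List<Message> consumed) {
        Set<String> seen = new HashSet<>();
        for (Message m : consumed) {
            if (!seen.add(m.key)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Message> log = Arrays.asList(
                new Message(1, "123", "abc"),
                new Message(2, "234", "def"),
                new Message(3, "345", "ghi"),
                new Message(4, "123", "changed"));

        // Prints offsets 2, 3 and 4; offset 1 is superseded by offset 4.
        for (Message m : compact(log)) {
            System.out.println(m);
        }
        System.out.println("duplicates before: " + hasDuplicateKeys(log));          // true
        System.out.println("duplicates after:  " + hasDuplicateKeys(compact(log))); // false
    }
}
```

The same duplicate-key check could be applied to records read back from the topic from the beginning: if it still reports duplicates long after the segment has rolled, the cleaner has probably not run on that partition.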