I am benchmarking my Kafka cluster, version 1.0.0-cp1.
In the part of my benchmark that focuses on the maximum possible throughput with an ordering guarantee and no data loss (a topic with only one partition), do I need to set the max.in.flight.requests.per.connection property to 1?
I've read this article
As I understand it, I only have to set max.in.flight to 1 if I enable the retry feature on my producer with the retries property.
Another way to ask my question: is a single partition + retries=0 (producer props) sufficient to guarantee ordering in Kafka?
I need to know because increasing max.in.flight drastically increases throughput.
Your use case is slightly unclear. You mention ordering and no data loss but don't specify whether you tolerate duplicate messages, so it's unclear whether you want At Least Once (QoS 1) or Exactly Once.
Either way, as you're using 1.0.0 and only a single partition, you should have a look at the Idempotent Producer instead of tweaking the Producer configs. It lets you guarantee ordering and no data loss properly and efficiently.
From the documentation:
Idempotent delivery ensures that messages are delivered exactly once to a particular topic partition during the lifetime of a single producer.
The early Idempotent Producer forced max.in.flight.requests.per.connection to 1 (for the same reasons you mentioned), but in the latest releases it can be used with max.in.flight.requests.per.connection set to up to 5 and still keep its guarantees.
Using the Idempotent Producer you'll not only get stronger delivery semantics (Exactly Once instead of At least Once) but it might even perform better!
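As a rough sketch, the idempotent producer settings could look like the following. It only builds a java.util.Properties object with the standard producer config keys; the class name, the broker address localhost:9092 and the String serializers are placeholders I'm assuming for illustration, not something from your setup.

```java
import java.util.Properties;

public class IdempotentProducerConfig {

    // Builds producer settings for an idempotent producer.
    // Broker address and serializers are assumptions; replace them with your own.
    public static Properties idempotentProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed address
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        // Turn on idempotent delivery (no duplicates, ordering kept per partition).
        props.setProperty("enable.idempotence", "true");
        // Idempotence requires acks=all.
        props.setProperty("acks", "all");
        // Up to 5 in-flight requests are allowed with idempotence in recent releases.
        props.setProperty("max.in.flight.requests.per.connection", "5");
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting configuration.
        idempotentProps().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

You would then pass these properties to new KafkaProducer<>(props) from the kafka-clients library.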
Back to your question
Yes: without the idempotent (or transactional) producer, if you want to avoid data loss (QoS 1) and preserve ordering, you have to set max.in.flight.requests.per.connection to 1, allow retries, and use acks=all. As you saw, this comes at a significant performance cost.
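For comparison, here is a minimal sketch of the non-idempotent settings described above. Again it only builds a java.util.Properties object; the class name, the broker address and the serializers are assumed placeholders, and the retries value (Integer.MAX_VALUE) is just one possible choice, since any value greater than 0 enables retries.

```java
import java.util.Properties;

public class OrderedAtLeastOnceConfig {

    // Builds producer settings for ordered, at-least-once delivery
    // without the idempotent producer.
    // Broker address and serializers are assumptions; replace them with your own.
    public static Properties orderedProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed address
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        // Only one unacknowledged request at a time, so a retry cannot
        // be overtaken by a later batch.
        props.setProperty("max.in.flight.requests.per.connection", "1");
        // Retry failed sends instead of dropping them (assumed value).
        props.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
        // Wait for all in-sync replicas to acknowledge each write.
        props.setProperty("acks", "all");
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting configuration.
        orderedProps().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```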