zeromq, C++, is it necessary to set a high water mark for subscribers?

user788171 · Apr 26, 2013 · Viewed 10.2k times

I did a quick test of the ZeroMQ PUB/SUB and now have some working code. However, I am a bit confused about the concept of high water mark as applied in zeromq.

I have set a HWM in my publisher code which sets a queue length for each subscriber connected to the socket.

However, it is also possible to set an HWM on the receiving socket of the subscriber. Is there any reason to set an HWM on the subscriber side, and how would this differ from setting a publisher HWM?

Answer

Franco Rondini · May 8, 2013

Short answer:

On the publisher we should almost always consider the HWM carefully, because there are plenty of ways for it to run out of memory and crash, which affects the system overall (the publisher serves all the subscribers).

On the subscriber there are also cases where tuning the HWM is useful, but this depends mostly on the nature of the subscriber: what it does with each received message, how likely it is that it cannot process a large volume of messages in time, and the expected runtime environment (how much memory is available, the number of subscribers, etc.).

More detailed answer:

ZMQ uses the concept of HWM (high-water mark) to define the capacity of its internal pipes. Each connection out of a socket or into a socket has its own pipe, with an HWM for sending and/or receiving, depending on the socket type. Some sockets (PUB, PUSH) have only send buffers. Some (SUB, PULL, REQ, REP) have only receive buffers. Some (DEALER, ROUTER, PAIR) have both send and receive buffers.

The available socket options are:

  • ZMQ_SNDHWM: Set high water mark for outbound messages (... on the publisher socket )
  • ZMQ_RCVHWM: Set high water mark for inbound messages (... on the subscriber socket )

ZMQ 3.0+ enforces default limits on its internal buffers (the so-called HWM), because an HWM is an effective way to reduce memory-overflow problems.

Both ZMQ_PUB and ZMQ_SUB have the HWM action set to "drop": when the limit is reached, new messages are silently discarded, so the memory used by the publisher or the subscriber should stop growing, at least as far as the ZMQ buffers are concerned.

Usually it is the publisher that needs the most protection against unbounded memory use (out-of-memory issues):

Over the inproc transport, the sender and receiver share the same buffers, so the effective HWM is the sum of the HWMs set by both sides.

But if you're using TCP and a subscriber is slow, messages will queue up on the publisher.

Common causes of PUB-SUB failure include:

  • Subscribers can fetch messages too slowly, so queues build up and then overflow.
  • Networks can become too slow, so publisher-side queues overflow and publishers crash.

Queuing messages on the publisher makes the publisher run out of memory and crash, especially if there are lots of subscribers and it's not possible to flush to disk for performance reasons.

From the publisher's perspective, the best strategy, which we enable by setting the HWM properly, is to stop queuing messages after a point so that new messages simply get rejected or dropped; this is what ØMQ does when the publisher sets an HWM.

ZMQ can also queue messages on the subscriber:

If anyone's going to run out of memory and crash, it'll be the subscriber rather than the publisher, which is fair. This is perfect for "peaky" streams, where a subscriber can't keep up for a while but can catch up when the stream slows down.

Note: the HWMs are not exact; while you may get up to 1,000 messages by default, the real buffer size may be much lower (as little as half), due to the way libzmq implements its queues.

The primary source for this answer is Pieter Hintjens's book "Code Connected Volume 1", available online in electronic format; it has a chapter dedicated to high-water marks containing further explanation of this topic.