Lost messages on ZeroMQ pub/sub

omer bach · Sep 19, 2011 · Viewed 15.3k times

I'm trying to implement the pub/sub pattern with ZeroMQ. The idea is to launch a subscriber first and the publisher afterwards. The subscriber listens for 100 messages and the publisher publishes 100 messages. So far so good. However, even though the subscriber is already up and running when the publisher is launched, it does not receive all of the messages (the subscriber only picks up 100 messages if the publisher sends at least 500). It seems that the first messages sent by the publisher never reach the subscriber.

Any ideas?

Thanks in advance, Omer.

Subscriber code (launched before the publisher)

zmq::context_t context (1);
zmq::socket_t subscriber (context, ZMQ_SUB);
subscriber.connect("tcp://localhost:5556");
//  Subscribe to every message (empty prefix filter)
subscriber.setsockopt(ZMQ_SUBSCRIBE, "", 0);

//  Block until 100 messages have been received
int i = 0;
for (int update_nbr = 0; update_nbr < 100; update_nbr++)
{
    zmq::message_t update;
    subscriber.recv(&update);
    i++;
    std::cout << "receiving  :" << i << std::endl;
}

Publisher code (launched after the subscriber)

zmq::context_t context (1);
zmq::socket_t publisher (context, ZMQ_PUB);
publisher.bind("tcp://*:5556");

int i = 0;
for (int update_nbr = 0; update_nbr < 100; update_nbr++)
{
    //  Format the current Unix time as the message payload
    char update [20] = "";
    snprintf (update, sizeof update, "%ld", (long) time (NULL));

    //  Size the message to the payload so no uninitialized bytes are sent
    zmq::message_t request (strlen (update));
    memcpy (request.data (), update, strlen (update));

    //  Send message to all subscribers
    publisher.send(request);
    i++;
    std::cout << "sending :" << i << std::endl;
}

Answer

DNA · Sep 22, 2011

See http://zguide.zeromq.org/page:all#Missing-Message-Problem-Solver and search for "slow joiner" on that webpage.

Basically, it takes a little time (a few milliseconds) for the connection to be set up, even when the subscriber was started first. A PUB socket silently drops any message it has no connected subscriber for, so everything published during that window is lost. The publisher needs to sleep a little before starting to publish, or (better) it needs to explicitly synchronize with the subscriber.