How do I cleanly reconnect a boost::socket following a disconnect?

GrahamS picture GrahamS · Jun 17, 2010 · Viewed 24.4k times · Source

My client application uses a boost::asio::ip::tcp::socket to connect to a remote server. If the app loses connection to this server (e.g. due to the server crashing or being shutdown) I would like it to attempt a re-connect at regular intervals until it succeeds.

What do I need to do on the client-side to cleanly handle a disconnect, tidy up and then repeatedly attempt reconnects?

Currently the interesting bits of my code look something like this.

I connect like this:

bool MyClient::myconnect()
{
    bool isConnected = false;

    // Attempt connection
    socket.connect(server_endpoint, errorcode);

    if (errorcode)
    {
        cerr << "Connection failed: " << errorcode.message() << endl;
        mydisconnect();
    }
    else
    {
        isConnected = true;

        // Connected so setup async read for an incoming message.
        startReadMessage();

        // And start the io_service_thread
        io_service_thread = new boost::thread(
            boost::bind(&MyClient::runIOService, this, boost::ref(io_service)));
    }
    return (isConnected)
}

Where the runIOServer() method is just:

void MyClient::runIOService(boost::asio::io_service& io_service)
{
    size_t executedCount = io_service.run();
    cout << "io_service: " << executedCount << " handlers executed." << endl;
    io_service.reset();
}

And if any of the async read handlers return an error then they just call this disconnect method:

void MyClient::mydisconnect(void)
{
    boost::system::error_code errorcode;

    if (socket.is_open())
    {
        // Boost documentation recommends calling shutdown first
        // for "graceful" closing of socket.
        socket.shutdown(boost::asio::ip::tcp::socket::shutdown_both, errorcode);
        if (errorcode)
        {
            cerr << "socket.shutdown error: " << errorcode.message() << endl;
        }

        socket.close(errorcode);
        if (errorcode)
        {
            cerr << "socket.close error: " << errorcode.message() << endl;
        }    

        // Notify the observer we have disconnected
        myObserver->disconnected();            
    }

..which attempts to gracefully disconnect and then notifies an observer, which will start calling connect() at five second intervals until it gets reconnected.

Is there anything else I need to do?

Currently this does seem to work. If I kill the server that it is connected to I get the expected "End of file" error at my read handlers and mydisconnect() is called without any issues.

But when it then attempts to re-connect and fails I see it report "socket.shutdown error: Invalid argument". Is this just because I am attempting to shutdown a socket that has no read/writes pending on it? Or is it something more?

Answer

bjlaub picture bjlaub · Jun 19, 2010

You need to create a new boost::asio::ip::tcp::socket each time you reconnect. The easiest way to do this is probably to just allocate the socket on the heap using a boost::shared_ptr (you could probably also get away with scoped_ptr if your socket is entirely encapsulated within a class). E.g.:

bool MyClient::myconnect()
{
    bool isConnected = false;

    // Attempt connection
    // socket is of type boost::shared_ptr<boost::asio::ip::tcp::socket>
    socket.reset(new boost::asio::ip::tcp::socket(...));
    socket->connect(server_endpoint, errorcode);
    // ...
}

Then, when mydisconnect is called, you could deallocate the socket:

void MyClient::mydisconnect(void)
{
    // ...
    // deallocate socket.  will close any open descriptors
    socket.reset();
}

The error you're seeing is probably a result of the OS cleaning up the file descriptor after you've called close. When you call close and then try to connect on the same socket, you're probably trying to connect an invalid file descriptor. At this point you should see an error message starting with "Connection failed: ..." based on your logic, but you then call mydisconnect which is probably then attempting to call shutdown on an invalid file descriptor. Vicious cycle!