How to reconnect to a socket gracefully

Null Reference picture Null Reference · Nov 28, 2014 · Viewed 29k times · Source

I have a following method that connects to an end point when my program starts

ChannelSocket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
var remoteIpAddress = IPAddress.Parse(ChannelIp);
ChannelEndPoint = new IPEndPoint(remoteIpAddress, ChannelPort);
ChannelSocket.Connect(ChannelEndPoint);

I also have a timer that is set to trigger every 60 seconds to call CheckConnectivity, that attempts to send an arbitrary byte array to the end point to make sure that the connection is still alive, and if the send fails, it will attempt to reconnect.

public bool CheckConnectivity(bool isReconnect)
{
    if (ChannelSocket != null)
    {
        var blockingState = ChannelSocket.Blocking;
        try
        {
            var tmp = new byte[] { 0 };
            ChannelSocket.Blocking = false;
            ChannelSocket.Send(tmp);
        }
        catch (SocketException e)
        {
            try
            {
                ReconnectChannel();
            }
            catch (Exception ex)
            {
                return false;
            }
        }
    }
    else
    {
        ConnectivityLog.Warn(string.Format("{0}:{1} is null!", ChannelIp, ChannelPort));
        return false;
    }

    return true;
} 

private void ReconnectChannel()
{
    try
    {
        ChannelSocket.Shutdown(SocketShutdown.Both);
        ChannelSocket.Disconnect(true);
        ChannelSocket.Close();
    }
    catch (Exception ex)
    {
        ConnectivityLog.Error(ex);
    }

    ChannelSocket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
    var remoteIpAddress = IPAddress.Parse(ChannelIp);
    ChannelEndPoint = new IPEndPoint(remoteIpAddress, ChannelPort);
    ChannelSocket.Connect(ChannelEndPoint);
    Thread.Sleep(1000);

    if (ChannelSocket.Connected)
    {
        ConnectivityLog.Info(string.Format("{0}:{1} is reconnected!", ChannelIp, ChannelPort));
    }
    else
    {
        ConnectivityLog.Warn(string.Format("{0}:{1} failed to reconnect!", ChannelIp, ChannelPort));
    }
}

So how I'd test the above, is to physically unplug the LAN cable from my ethernet device, allowing my code to attempt to reconnect (which fails obviously) and reconnect back the LAN cable.

However, even after reconnecting the LAN cable (able to ping), ChannelSocket.Connect(ChannelEndPoint) in my Reconnect method always throws this error

No connection could be made because the target machine actively refused it 192.168.168.160:4001

If I were to restart my whole application, it connects successfully. How can I tweak my reconnect method such that I don't have to restart my application to reconnect back to my Ethernet device?

Answer

Vikas Gupta picture Vikas Gupta · Dec 2, 2014

If an application closes a TCP/IP port, the protocol dictates that the port stays in TIME_WAIT state for a certain duration (default of 240 seconds on a windows machine). See following for references -

http://en.wikipedia.org/wiki/Transmission_Control_Protocol

http://support.microsoft.com/kb/137984

http://www.pctools.com/guides/registry/detail/878/

What this means for your scenario - is that you cannot expect to close (willingly or unwillingly) and re-open a port within a short period of time (even several seconds). Despite some registry tweaks which you'd find on internet.. the port will be un-available for any app on windows, for a minimum of 30 seconds. (Again, default is 240 seconds)

Your options - here are limited...

  1. From the documentation at http://msdn.microsoft.com/en-us/library/4xzx2d41(v=vs.110).aspx -

"If the socket has been previously disconnected, then you cannot use this (Connect) method to restore the connection. Use one of the asynchronous BeginConnect methods to reconnect. This is a limitation of the underlying provider."

The reason why documentation suggests that BeginConnect must be used is what I mentioned above.. It simply doesn't expect to be able to establish the connection right away.. and hence the only option is to make the call asynchronously, and while you wait for the connection to get established in several minutes, do expect and plan for it to fail. Essentially, likely not an ideal option.

  1. If the long wait and uncertainty is not acceptable, then your other option is to somehow negotiate a different port between the client and server. (For example, in theory you could use UDP, which is connectionless, to negotiate the new TCP port you'd re-establish the connection on). Communication using UDP, in theory of course, itself is not guaranteed by design. But should work most of the times (Today, networking in typical org is not that flaky / unreliable). Subjective to scenario / opinion, perhaps better than option 1, but more work and smaller but finite chance of not working.

  2. As suggested in one of the comments, this is where application layer protocols like http and http services have an advantage. Use them, instead of low level sockets, if you can. If acceptable, this is the best option to go with.

(PS - FYI - For HTTP, there is a lot of special handling built into OS, including windows - For example, there is a dedicated driver Http.sys, specially for dealing with multiple apps trying to listen on same port 80 etc.. The details here are a topic for another time.. point is, there is lots of goodness and hard work done for you, when it comes to HTTP)