How to write a scalable TCP/IP based server

Erik Funkenbusch · May 15, 2009 · Viewed 67.5k times

I am in the design phase of writing a new Windows Service application that accepts TCP/IP connections for long running connections (i.e. this is not like HTTP where there are many short connections, but rather a client connects and stays connected for hours or days or even weeks).

I'm looking for ideas for the best way to design the network architecture. I'm going to need to start at least one thread for the service. I am considering using the async API (BeginReceive, etc.) since I don't know how many clients I will have connected at any given time (possibly hundreds). I definitely do not want to start a thread for each connection.

Data will primarily flow out to the clients from my server, but there will be some commands sent from the clients on occasion. This is primarily a monitoring application in which my server sends status data periodically to the clients.

Any suggestions on the best way to make this as scalable as possible? Basic workflow? Thanks.

EDIT: To be clear, I'm looking for .NET-based solutions (C# if possible, but any .NET language will work)

BOUNTY NOTE: To be awarded the bounty, I expect more than a simple answer. I would need a working example of a solution, either as a pointer to something I could download or a short example in-line. And it must be .NET and Windows based (any .NET language is acceptable)

EDIT: I want to thank everyone that gave good answers. Unfortunately, I could only accept one, and I chose to accept the better-known Begin/End method. Esac's solution may well be better, but it's still new enough that I don't know for sure how it will work out.

I have upvoted all the answers I thought were good; I wish I could do more for you guys. Thanks again.

Answer

Kevin Nisbet · May 18, 2009

I've written something similar to this in the past. My research from years ago showed that writing your own socket implementation was the best bet, using asynchronous sockets. This meant that clients not really doing anything actually required relatively few resources. Anything that does occur is handled by the .NET thread pool.

I wrote it as a class that manages all connections for the servers.

I simply used a list to hold all the client connections, but if you need faster lookups for larger lists, you can write it however you want.

private List<xConnection> _sockets;
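If lookups ever become a bottleneck, a keyed collection is one option. Here is a minimal sketch of that idea, keyed by the client's remote endpoint and guarded by the same lock discipline; the field and helper names are illustrative only and aren't used in the rest of the answer, which sticks with _sockets.

private Dictionary<System.Net.EndPoint, xConnection> _connectionsByEndPoint = new Dictionary<System.Net.EndPoint, xConnection>();

//Illustrative helpers only: O(1) lookup of a connection by its remote endpoint
private void AddConnection(xConnection conn)
{
  lock (_connectionsByEndPoint)
  {
    _connectionsByEndPoint[conn.socket.RemoteEndPoint] = conn;
  }
}

private bool TryGetConnection(System.Net.EndPoint remote, out xConnection conn)
{
  lock (_connectionsByEndPoint)
  {
    return _connectionsByEndPoint.TryGetValue(remote, out conn);
  }
}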

You also need the socket that is actually listening for incoming connections.

private System.Net.Sockets.Socket _serverSocket;

The Start method actually starts the server socket and begins listening for any incoming connections.

public bool Start()
{
  System.Net.IPHostEntry localhost = System.Net.Dns.GetHostEntry(System.Net.Dns.GetHostName());
  System.Net.IPEndPoint serverEndPoint;
  try
  {
    serverEndPoint = new System.Net.IPEndPoint(localhost.AddressList[0], _port);
  }
  catch (System.ArgumentOutOfRangeException e)
  {
    throw new ArgumentOutOfRangeException("Port number entered would seem to be invalid, should be between 1024 and 65000", e);
  }
  try
  {
    _serverSocket = new System.Net.Sockets.Socket(serverEndPoint.Address.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
  }
  catch (System.Net.Sockets.SocketException e)
  {
    throw new ApplicationException("Could not create socket, check to make sure you're not duplicating the port", e);
  }
  try
  {
    _serverSocket.Bind(serverEndPoint);
    _serverSocket.Listen(_backlog);
  }
  catch (Exception e)
  {
    throw new ApplicationException("Error occurred while binding socket, check inner exception", e);
  }
  try
  {
    //warning, only call this once; there is a bug in .NET 2.0 that breaks if
    //you're running multiple async accepts. This bug may be fixed, but
    //it was a major pain in the ass previously, so make sure there is only one
    //BeginAccept running
    _serverSocket.BeginAccept(new AsyncCallback(acceptCallback), _serverSocket);
  }
  catch (Exception e)
  {
    throw new ApplicationException("Error occurred starting listeners, check inner exception", e);
  }
  return true;
}

I'd just like to note that the exception handling code looks bad, but the reason for it is that I had exception suppression code in there, so that any exceptions would be suppressed and Start would return false if a config option was set. I removed it for brevity's sake.

The _serverSocket.BeginAccept(new AsyncCallback(acceptCallback), _serverSocket) call above essentially sets our server socket to call the acceptCallback method whenever a user connects. The callback runs on the .NET thread pool, which automatically handles creating additional worker threads if you have many blocking operations. This should optimally handle any load on the server.

private void acceptCallback(IAsyncResult result)
{
  xConnection conn = new xConnection();
  try
  {
    //Finish accepting the connection
    System.Net.Sockets.Socket s = (System.Net.Sockets.Socket)result.AsyncState;
    conn.socket = s.EndAccept(result);
    conn.buffer = new byte[_bufferSize];
    lock (_sockets)
    {
      _sockets.Add(conn);
    }
    //Queue receiving of data from the connection
    conn.socket.BeginReceive(conn.buffer, 0, conn.buffer.Length, SocketFlags.None, new AsyncCallback(ReceiveCallback), conn);
    //Queue the accept of the next incoming connection
    _serverSocket.BeginAccept(new AsyncCallback(acceptCallback), _serverSocket);
  }
  catch (SocketException)
  {
    if (conn.socket != null)
    {
      conn.socket.Close();
      lock (_sockets)
      {
        _sockets.Remove(conn);
      }
    }
    //Queue the next accept; I think this should be here to stop attacks based on killing the waiting listeners
    _serverSocket.BeginAccept(new AsyncCallback(acceptCallback), _serverSocket);
  }
  catch (Exception)
  {
    if (conn.socket != null)
    {
      conn.socket.Close();
      lock (_sockets)
      {
        _sockets.Remove(conn);
      }
    }
    //Queue the next accept; I think this should be here to stop attacks based on killing the waiting listeners
    _serverSocket.BeginAccept(new AsyncCallback(acceptCallback), _serverSocket);
  }
}

The above code essentially just finishes accepting the connection that comes in, queues a BeginReceive whose callback will run when the client sends data, and then queues the next BeginAccept, which will accept the next client connection that comes in.

The BeginReceive call is what tells the socket what to do when it receives data from the client. You need to give it a byte array, which is where the data will be copied when the client sends it. The ReceiveCallback method will then get called, which is how we handle receiving the data.

private void ReceiveCallback(IAsyncResult result)
{
  //get our connection from the callback
  xConnection conn = (xConnection)result.AsyncState;
  //catch any errors; we'd better not have any
  try
  {
    //Grab our buffer and count the number of bytes received
    int bytesRead = conn.socket.EndReceive(result);
    //make sure we've read something; if we haven't, it supposedly means that the client disconnected
    if (bytesRead > 0)
    {
      //put whatever you want to do when you receive data here

      //Queue the next receive
      conn.socket.BeginReceive(conn.buffer, 0, conn.buffer.Length, SocketFlags.None, new AsyncCallback(ReceiveCallback), conn);
    }
    else
    {
      //Callback ran but no data, close the connection
      //supposedly means a disconnect
      //and we still have to close the socket, even though we throw the event later
      conn.socket.Close();
      lock (_sockets)
      {
        _sockets.Remove(conn);
      }
    }
  }
  catch (SocketException)
  {
    //Something went terribly wrong
    //which shouldn't have happened
    if (conn.socket != null)
    {
      conn.socket.Close();
      lock (_sockets)
      {
        _sockets.Remove(conn);
      }
    }
  }
}

EDIT: I forgot to mention that in this area of the code:

//put whatever you want to do when you receive data here

//Queue the next receive
conn.socket.BeginReceive(conn.buffer, 0, conn.buffer.Length, SocketFlags.None, new AsyncCallback(ReceiveCallback), conn);

What I would generally do in the "whatever you want" section is reassemble packets into messages, and then create them as jobs on the thread pool. That way the BeginReceive of the next block from the client isn't delayed while whatever message-processing code is running.
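As a rough sketch of what that could look like inside the "whatever you want" spot (ProcessMessage is a hypothetical stand-in for your own handling code, not something defined elsewhere in this answer):

//Copy the received bytes out of the shared buffer and hand them to the
//thread pool, so the next BeginReceive can be queued right away.
byte[] received = new byte[bytesRead];
Array.Copy(conn.buffer, received, bytesRead);
System.Threading.ThreadPool.QueueUserWorkItem(delegate(object state)
{
  ProcessMessage(conn, (byte[])state); //hypothetical handler
}, received);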

The receive callback finishes reading from the socket by calling EndReceive. This fills the buffer provided in the BeginReceive call. Once you do whatever you want where I left the comment, we call the next BeginReceive, which will run the callback again if the client sends any more data. Now here's the really tricky part: when the client sends data, your receive callback might only be called with part of the message. Reassembly can become very, very complicated. I used my own method and created a sort of proprietary protocol to do this. I left it out, but if you request, I can add it in. This handler was actually the most complicated piece of code I had ever written.
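Since the actual protocol is left out, here is one common approach as a purely illustrative sketch (not the protocol described above): a simple length-prefixed framing scheme. It assumes each message is sent as a 4-byte length followed by the payload, that xConnection is given an extra System.IO.MemoryStream field named pending to accumulate partial reads, and that ProcessMessage is the hypothetical handler from the earlier sketch.

//Purely illustrative length-prefixed framing, not the protocol described above.
private void AppendAndExtractMessages(xConnection conn, int bytesRead)
{
  //accumulate whatever arrived this time
  conn.pending.Write(conn.buffer, 0, bytesRead);
  while (true)
  {
    byte[] data = conn.pending.ToArray();
    if (data.Length < 4)
      break; //not even a complete length prefix yet
    int length = BitConverter.ToInt32(data, 0);
    if (data.Length < 4 + length)
      break; //message body not complete yet
    byte[] message = new byte[length];
    Array.Copy(data, 4, message, 0, length);
    //keep any bytes that belong to the next message
    conn.pending = new System.IO.MemoryStream();
    conn.pending.Write(data, 4 + length, data.Length - 4 - length);
    ProcessMessage(conn, message); //hypothetical handler, or queue it on the thread pool as above
  }
}

The sending side would then prefix each message with BitConverter.GetBytes(message.Length) before handing it to Send.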

public bool Send(byte[] message, xConnection conn)
{
  if (conn != null && conn.socket.Connected)
  {
    lock (conn.socket)
    {
      //we use a blocking-mode send, no async on the outgoing
      //since this is primarily a multithreaded application, sending in blocking mode shouldn't cause problems
      conn.socket.Send(message, message.Length, SocketFlags.None);
    }
  }
  else
    return false;
  return true;
}

The above Send method actually uses a synchronous Send call; for me that was fine due to the message sizes and the multithreaded nature of my application. If you want to send to every client, you simply need to loop through the _sockets list.
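A broadcast along those lines might look like this, snapshotting the list under the lock so the lock isn't held during the blocking sends:

//Send the same message to every connected client.
public void Broadcast(byte[] message)
{
  List<xConnection> snapshot;
  lock (_sockets)
  {
    //copy the list so we don't hold the lock while sending
    snapshot = new List<xConnection>(_sockets);
  }
  foreach (xConnection conn in snapshot)
  {
    Send(message, conn);
  }
}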

The xConnection class you see referenced above is basically a simple wrapper for a socket to include the byte buffer, and in my implementation some extras.

public class xConnection : xBase
{
  public byte[] buffer;
  public System.Net.Sockets.Socket socket;
}

Also, for reference, here are the usings the code above needs, since I always get annoyed when they aren't included.

using System;
using System.Collections.Generic;
using System.Net.Sockets;

I hope that's helpful; it may not be the cleanest code, but it works. There are also some nuances to the code which you should be wary about changing. For one, only have a single BeginAccept call outstanding at any one time. There used to be a very annoying .NET bug around this, which was years ago, so I don't recall the details.

Also, in the ReceiveCallback code, we process anything received from the socket before we queue the next receive. This means that for a single socket, we're only actually ever in ReceiveCallback once at any point in time, and we don't need to use thread synchronization. However, if you reorder this to call the next receive immediately after pulling the data, which might be a little faster, you will need to make sure you properly synchronize the threads.
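And since the question asks about a Windows Service, the hosting side might look roughly like this. This is only a sketch: xSocketServer is a placeholder name for the class containing Start() (it is never named above), and it needs a reference to System.ServiceProcess.

//Minimal Windows Service host; xSocketServer is an assumed class name.
public class MonitoringService : System.ServiceProcess.ServiceBase
{
  private xSocketServer _server = new xSocketServer();

  protected override void OnStart(string[] args)
  {
    _server.Start();
  }

  protected override void OnStop()
  {
    //a matching shutdown method would close _serverSocket and each client
    //socket; that part is left out of the code above
  }
}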

Also, I hacked out a lot of my code, but left the essence of what's happening in place. This should be a good start for your design. Leave a comment if you have any more questions around this.