Would this be a good example of a scalable, multi-threaded HttpListener?
Is this, for example, how a real IIS would do it?
using System;
using System.Net;
using System.Threading;
using System.Threading.Tasks;

public class Program
{
    private static readonly HttpListener Listener = new HttpListener();

    public static void Main()
    {
        Listener.Prefixes.Add("http://+:80/");
        Listener.Start();
        Listen();
        Console.WriteLine("Listening...");
        Console.WriteLine("Press any key to exit...");
        Console.ReadKey();
    }

    private static async void Listen()
    {
        while (true)
        {
            var context = await Listener.GetContextAsync();
            Console.WriteLine("Client connected");
            Task.Factory.StartNew(() => ProcessRequest(context));
        }
    }

    private static void ProcessRequest(HttpListenerContext context)
    {
        // Simulate work, then close the response so the client is not left waiting.
        Thread.Sleep(10 * 1000);
        context.Response.Close();
        Console.WriteLine("Response");
    }
}
I'm specifically looking for a scalable solution that does NOT rely on IIS, only on http.sys (which is what the HttpListener class wraps). The reason for not relying on IIS is that the government area I work in requires an extremely reduced attack surface.
I've done something similar at https://github.com/JamesDunne/Aardwolf and have done some extensive testing on this.
See the code at https://github.com/JamesDunne/aardwolf/blob/master/Aardwolf/HttpAsyncHost.cs#L107 for the core event loop's implementation.
I find that using a Semaphore to control how many concurrent GetContextAsync accepts are active is the best approach. Essentially, the main loop keeps posting accepts until the semaphore's count is exhausted and it blocks; at that point there are N concurrent "connection accepts" in flight. Each time a connection is accepted, the semaphore is released and a new accept takes its place.
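The loop can be sketched roughly like this. This is an illustrative sketch, not Aardwolf's actual code: the class name, the fixed "hello" response, and the hard-coded count of 128 are all assumptions for the example; see HttpAsyncHost.cs linked above for the real implementation.

```csharp
using System;
using System.Net;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public class SemaphoreGatedHost
{
    private readonly HttpListener _listener = new HttpListener();
    // Caps how many concurrent GetContextAsync accepts are in flight;
    // 128 is just the value used as an example here.
    private readonly SemaphoreSlim _accepts = new SemaphoreSlim(128, 128);

    public async Task RunAsync(string prefix)
    {
        _listener.Prefixes.Add(prefix);
        _listener.Start();

        while (_listener.IsListening)
        {
            // The loop parks here (without blocking a thread) once all
            // 128 accept slots are taken.
            await _accepts.WaitAsync();
            _ = AcceptOneAsync();
        }
    }

    private async Task AcceptOneAsync()
    {
        HttpListenerContext context;
        try
        {
            context = await _listener.GetContextAsync();
        }
        catch
        {
            // Listener was stopped; nothing to process.
            return;
        }
        finally
        {
            // A connection was accepted (or the accept failed), so free
            // the slot and let the main loop post a new accept in its place.
            _accepts.Release();
        }

        // Request handling happens outside the accept gate; here we just
        // send a fixed body and close the response.
        byte[] body = Encoding.UTF8.GetBytes("hello");
        context.Response.ContentLength64 = body.Length;
        await context.Response.OutputStream.WriteAsync(body, 0, body.Length);
        context.Response.Close();
    }
}
```

Note that the semaphore gates only the number of outstanding accepts, not the number of in-flight request handlers; that matches the description above, where a slot is released as soon as a connection is accepted.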
The semaphore's initial and max count values require some fine tuning depending on the load you expect to receive. It's a delicate balancing act between the number of concurrent connections you expect and the average response times your clients desire. Higher values mean more connections can be maintained, yet at a much slower average response time, and fewer connections will be rejected. Lower values mean fewer connections can be maintained, yet at a much faster average response time, and more connections will be rejected.
I've found, experimentally (on my hardware), that values around 128 allow the server to handle large numbers of concurrent connections (up to 1,024) at acceptable response times. Test on your own hardware and tune your parameters accordingly.
I've also found that a single instance of WCAT does not like to handle more than 1,024 connections itself. So if you're serious about load-testing, use multiple client machines running WCAT against your server, be sure to test over a fast network (e.g. 10 GbE), and make sure your OS's limits are not slowing you down. Be sure to test on Windows Server SKUs, because the Desktop SKUs are limited by default.
Summary: How you write your connection accept loop is critical to the scalability of your server.