Socket server with epoll and threads

cemycc · Nov 28, 2011 · Viewed 7.5k times

I am trying to create a socket server in C for a collaborative real-time editor (http://en.wikipedia.org/wiki/Collaborative_real-time_editor), but I don't know what the best server architecture for it is.

At first I tried to use select for the socket server, but then I read about epoll, and now I think epoll is the best choice: the client will send every letter the user types in the textarea to the server, so the server will have a lot of data to process.

I also want to use threads with epoll, but I don't know exactly how to combine them. I want threads because I think it is better to use two or all of the CPUs on the target machine.

My plan is:

  • create 2 threads when the server starts

  • the first thread will handle new clients and prepare them for reading or sending

  • the second thread will read data from and send data to the clients

The problem is that both of these threads will run a while(1) loop around epoll_wait.

My questions: is this a good server architecture for using epoll with threads? If not, what other options do I have?

EDIT: I can't use libevent, libev, or other libraries because this is a college project and I'm not allowed to use external libraries.

Answer

David Brigada · Dec 18, 2011

I think you're trying to over-engineer this problem. The epoll architecture in Linux was intended for situations where you have thousands of concurrent connections. In those cases, the overhead imposed by the way the poll and select system calls are defined (the whole descriptor set is handed to the kernel and scanned on every call) becomes the main bottleneck in a server. The decision to use poll or select vs. epoll is based on the number of connections, not the amount of data.

For what you're doing, it seems as though the humans using your editing system would go insane once you hit a few dozen concurrent editors. Using epoll will probably make you go crazy too; the API plays a few tricks to squeeze out the extra performance, and you have to be very careful processing the information you get back from the calls.
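The answer does not say which tricks it has in mind. One commonly cited example, and this is my assumption rather than something stated above, is edge-triggered mode (EPOLLET): epoll_wait reports a descriptor only when its readiness changes, so the handler has to keep reading until read fails with EAGAIN or it may never be woken for the leftover bytes. A minimal sketch of that pattern (the helper names set_nonblocking, watch_edge_triggered, and drain_fd are hypothetical):

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Put fd into non-blocking mode. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Register fd with the epoll instance for edge-triggered read events. */
static int watch_edge_triggered(int epfd, int fd)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN | EPOLLET;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/*
 * Read until the kernel says there is nothing left (EAGAIN).
 * Returns the number of bytes consumed, or -1 on a real error.
 * *closed is set to 1 if the peer closed the connection.
 */
static long drain_fd(int fd, int *closed)
{
    char buf[4096];
    long total = 0;

    assert(closed != NULL);
    *closed = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;          /* a real server would process buf here */
            continue;
        }
        if (n == 0) {            /* end of file: peer closed */
            *closed = 1;
            return total;
        }
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;        /* fully drained; safe to epoll_wait again */
        return -1;
    }
}
```

With level-triggered poll, a handler that reads only part of the data is simply told about the descriptor again on the next call, which is part of why poll is the more forgiving starting point.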

This sort of application sounds like it would be network-I/O-bound instead of CPU-bound. I would try writing it as a single-threaded server with poll first. When you receive new text, buffer it for your clients if necessary, and then send it out when the socket accepts write calls. Use non-blocking I/O; the only call you want to block is the poll call.
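The single-threaded design described above could look roughly like the following. This is only a sketch under my own assumptions: the names (struct server, server_init, server_step), the fixed buffer sizes, and the choice to silently drop text for a client whose buffer is full are all hypothetical, and MSG_NOSIGNAL is used so that a send to a vanished client does not raise SIGPIPE.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

enum { MAX_CLIENTS = 32, OUT_CAP = 4096 };

/* Slot 0 is the listening socket; slots 1..n-1 are clients. */
struct server {
    struct pollfd fds[MAX_CLIENTS + 1];
    char out[MAX_CLIENTS + 1][OUT_CAP];   /* bytes waiting to be sent */
    size_t out_len[MAX_CLIENTS + 1];
    int n;
};

/* Listen on 127.0.0.1:port; port 0 lets the kernel pick a free port. */
static int server_init(struct server *s, unsigned short port)
{
    struct sockaddr_in addr = {0};
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (fd < 0) return -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 || listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
    memset(s, 0, sizeof *s);
    s->fds[0].fd = fd;
    s->n = 1;
    return 0;
}

/* Close client i and move the last client into its slot. */
static void drop_client(struct server *s, int i)
{
    close(s->fds[i].fd);
    s->n--;
    s->fds[i] = s->fds[s->n];
    memmove(s->out[i], s->out[s->n], s->out_len[s->n]);
    s->out_len[i] = s->out_len[s->n];
}

/* Buffer data for every client except the sender. */
static void queue_for_others(struct server *s, int from, const char *buf, size_t len)
{
    for (int j = 1; j < s->n; j++) {
        size_t room = OUT_CAP - s->out_len[j];
        size_t take = len < room ? len : room;   /* sketch: overflow is dropped */
        if (j == from) continue;
        memcpy(s->out[j] + s->out_len[j], buf, take);
        s->out_len[j] += take;
    }
}

/* One pass of the event loop. poll() is the only call that can block. */
static int server_step(struct server *s, int timeout_ms)
{
    char buf[1024];
    s->fds[0].events = s->n <= MAX_CLIENTS ? POLLIN : 0;
    for (int i = 1; i < s->n; i++)       /* ask for POLLOUT only when data is queued */
        s->fds[i].events = (short)(POLLIN | (s->out_len[i] ? POLLOUT : 0));
    int ready = poll(s->fds, (nfds_t)s->n, timeout_ms);
    if (ready <= 0) return ready;
    for (int i = s->n - 1; i >= 1; i--) {   /* backwards, so dropping is safe */
        short re = s->fds[i].revents;
        ssize_t r = (re & (POLLERR | POLLHUP | POLLNVAL)) ? 0 : 1;  /* error: treat as EOF */
        if (re & POLLIN) {
            r = read(s->fds[i].fd, buf, sizeof buf);
            if (r > 0) queue_for_others(s, i, buf, (size_t)r);
        }
        if (r == 0 || (r < 0 && errno != EAGAIN && errno != EINTR)) {
            drop_client(s, i);
            continue;
        }
        if (re & POLLOUT) {
            ssize_t w = send(s->fds[i].fd, s->out[i], s->out_len[i], MSG_NOSIGNAL);
            if (w > 0) {                     /* keep whatever was not sent yet */
                s->out_len[i] -= (size_t)w;
                memmove(s->out[i], s->out[i] + w, s->out_len[i]);
            }
        }
    }
    if (s->fds[0].revents & POLLIN) {
        int c = accept(s->fds[0].fd, NULL, NULL);
        if (c >= 0) {
            assert(s->n <= MAX_CLIENTS);
            fcntl(c, F_SETFL, fcntl(c, F_GETFL, 0) | O_NONBLOCK);
            s->fds[s->n].fd = c;
            s->out_len[s->n] = 0;
            s->n++;
        }
    }
    return ready;
}
```

A real server would call server_init once and then run server_step(&s, -1) in a loop. For an editor, dropping overflow is not acceptable in practice; disconnecting or resynchronizing a slow client would be the more realistic policy.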

If you are doing a significant amount of processing on the data after receiving it, but before sending it back out to clients, then you could benefit from multi-threading. Write the single-threaded version first, then if you are CPU-bound (check using top) and most of the CPU time is spent in the functions where you are doing data processing (check using gprof), add multithreading to do the data processing.

If you want, you can use pipes or Unix-domain sockets inside the program for communication between the different threads; that way everything in the main thread stays event-driven and is handled through poll. Alternatively, with this model, you could even use multiple processes with fork instead of multiple threads.
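To illustrate the pipe idea, here is a minimal sketch that uses the fork variant mentioned above: a child process does the processing and writes its result to a pipe, and the parent waits for that result with poll, the same way it waits for sockets. The function names and the upper-casing "processing" step are hypothetical stand-ins. A worker thread would write to the pipe in the same way.

```c
#include <assert.h>
#include <ctype.h>
#include <poll.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for real CPU-heavy processing (hypothetical): upper-case in place. */
static void process(char *text)
{
    for (char *p = text; *p; p++)
        *p = (char)toupper((unsigned char)*p);
}

/*
 * Process text in a child and collect the result through a pipe.
 * Returns the number of result bytes copied into out, or -1 on error/timeout.
 */
static ssize_t process_in_child(const char *text, char *out, size_t out_cap,
                                int timeout_ms)
{
    int p[2];
    char work[256];
    struct pollfd pfd;
    ssize_t n = -1;
    pid_t pid;

    assert(text != NULL && out != NULL);
    if (strlen(text) >= sizeof work || pipe(p) != 0)
        return -1;
    strcpy(work, text);

    pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {                      /* child: do the work, report, exit */
        close(p[0]);
        process(work);
        /* A write of at most PIPE_BUF bytes to a pipe is atomic. */
        ssize_t w = write(p[1], work, strlen(work));
        _exit(w < 0 ? 1 : 0);
    }

    close(p[1]);                         /* parent keeps only the read end */
    pfd.fd = p[0];
    pfd.events = POLLIN;
    /* In a real server p[0] would sit in the same pollfd array as the sockets. */
    if (poll(&pfd, 1, timeout_ms) == 1 && (pfd.revents & POLLIN))
        n = read(p[0], out, out_cap);
    close(p[0]);
    waitpid(pid, NULL, 0);               /* reap the child */
    return n;
}
```

The point of the design is that the main loop never needs a mutex or condition variable: a finished result looks like just another readable descriptor.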