I'm doing a threaded asynchronous networking experiment in Python, using UDP.
I'd like to understand polling and Python's select module; I've never used them in C/C++.
What are they for? I understand select a little, but does it block while watching a resource? What is the purpose of polling?
Okay, one question at a time.
Here is a simple socket server skeleton:
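import socket

s_sock = socket.socket()
s_sock.bind(("0.0.0.0", 8000))  # example address and port; bind() requires one
s_sock.listen()
while True:
    c_sock, c_addr = s_sock.accept()  # blocks until a client connects
    process_client_sock(c_sock, c_addr)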
The server loops and accepts a connection from a client, then calls its process function to communicate with the client socket. There is a problem here: process_client_sock might take a long time, or even contain a loop (which is often the case).
def process_client_sock(c_sock, c_addr):
    while True:
        receive_or_send_data(c_sock)
In that case, the server is unable to accept any more connections.
A simple solution is to use multiple processes or threads: create a new thread to deal with each request, while the main loop keeps listening for new connections.
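import socket
from threading import Thread

s_sock = socket.socket()
s_sock.bind(("0.0.0.0", 8000))  # example address and port; bind() requires one
s_sock.listen()
while True:
    c_sock, c_addr = s_sock.accept()
    # Handle each client in its own thread so accept() can run again at once.
    thread = Thread(target=process_client_sock, args=(c_sock, c_addr))
    thread.start()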
This works, of course, but it does not perform well enough: each new process or thread takes extra CPU and memory, which is not ideal for servers that might get thousands of connections.
The select and poll system calls try to solve this problem. You give select a set of file descriptors and tell it to notify you when any fd is ready to read or write, or when an exception occurs on it.
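As a rough illustration of that idea in Python, here is a minimal sketch of one pass of a single-threaded loop built on select.select. The function name serve_once, the clients set, and the echo behaviour are assumptions made for this example, not part of the skeleton above.

```python
import select
import socket


def serve_once(s_sock, clients, timeout=1.0):
    """Run one iteration of a select()-based server loop.

    s_sock is a listening socket, clients is a set of connected sockets.
    Returns how many sockets select() reported as readable.
    """
    # Ask the OS which of these sockets can be read without blocking.
    readable, _, _ = select.select([s_sock, *clients], [], [], timeout)
    for sock in readable:
        if sock is s_sock:
            # The listening socket is readable: a new client is waiting.
            c_sock, _c_addr = s_sock.accept()
            clients.add(c_sock)
        else:
            data = sock.recv(4096)
            if data:
                sock.sendall(data)  # echo the data back (example behaviour)
            else:
                # An empty read means the peer closed the connection.
                clients.discard(sock)
                sock.close()
    return len(readable)
```

Calling serve_once in a while True loop gives a server that handles many clients in one thread. Note that sendall can still block on a slow client; a fuller version would also watch sockets for writability.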
Whether select blocks or not depends on the timeout parameter you pass to it.
As the select man page says, it takes a struct timeval parameter:
int select(int nfds, fd_set *readfds, fd_set *writefds,
           fd_set *exceptfds, struct timeval *timeout);

struct timeval {
    long tv_sec;   /* seconds */
    long tv_usec;  /* microseconds */
};
There are three cases:
timeout.tv_sec == 0 and timeout.tv_usec == 0: non-blocking, returns immediately.
timeout == NULL: blocks until a file descriptor is ready.
timeout holds a nonzero duration: waits up to that long, and returns if no file descriptor has become ready by then.
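The same three cases can be tried from Python, where select.select takes the timeout as an optional number of seconds instead of a struct timeval. This is only a small sketch; the socket pair and the variable names are assumptions made for the demonstration.

```python
import select
import socket

# A connected pair of sockets stands in for a server/client connection.
a, b = socket.socketpair()

# timeout=0: check and return immediately, even though nothing is ready.
ready_now, _, _ = select.select([a], [], [], 0)

# timeout=0.1: wait up to 0.1 seconds, then give up with empty lists.
ready_after_wait, _, _ = select.select([a], [], [], 0.1)

# timeout omitted (None): block until a descriptor is ready.
# Data is sent first so that this call has something to return.
b.send(b"x")
ready_blocking, _, _ = select.select([a], [], [])

a.close()
b.close()
```

Here the first two calls return empty lists because nothing has been sent yet, while the last call returns a list containing the readable socket.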
Put simply: polling frees the CPU for other work while it waits for I/O.
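This is based on the simple fact that the CPU is much faster than I/O, so time spent blocked waiting on I/O is time the CPU could have spent on something else.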
Hope it helps.