What happens when a TCP/UDP server is publishing faster than the client is consuming?

Shahbaz · Jan 4, 2010

I am trying to get a handle on what happens when a server publishes (over TCP, UDP, etc.) faster than a client can consume the data.

Within a program I understand that if a queue sits between the producer and the consumer, it will start to get larger. If there is no queue, then the producer simply won't be able to produce anything new, until the consumer can consume (I know there may be many more variations).
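(For concreteness, here's a minimal sketch of that in-process case: a bounded queue guarded by a mutex and condition variables, where a full queue blocks the producer. Names like queue_t and q_push are illustrative, not a real API.)

```c
#include <pthread.h>

#define QSIZE 8

typedef struct {
    int buf[QSIZE];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_full, not_empty;
} queue_t;

queue_t q = { .lock = PTHREAD_MUTEX_INITIALIZER,
              .not_full = PTHREAD_COND_INITIALIZER,
              .not_empty = PTHREAD_COND_INITIALIZER };

void q_push(queue_t *q, int v) {
    pthread_mutex_lock(&q->lock);
    while (q->count == QSIZE)              /* queue full: producer waits */
        pthread_cond_wait(&q->not_full, &q->lock);
    q->buf[q->tail] = v;
    q->tail = (q->tail + 1) % QSIZE;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

int q_pop(queue_t *q) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)                  /* queue empty: consumer waits */
        pthread_cond_wait(&q->not_empty, &q->lock);
    int v = q->buf[q->head];
    q->head = (q->head + 1) % QSIZE;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return v;
}
```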

I am not clear on what happens when data leaves the server (which may be a different process, machine or data center) and is sent to the client. If the client simply can't respond to the incoming data fast enough, assuming the server and the consumer are very loosely coupled, what happens to the in-flight data?

Where can I read to get details on this topic? Do I just have to read the low level details of TCP/UDP?

Thanks

Answer

Len Holgate · Jan 4, 2010

With TCP there's a TCP window which is used for flow control. TCP only allows a certain amount of data to remain unacknowledged at a time. If a server produces data faster than the client consumes it, the amount of unacknowledged data will grow until the TCP window is 'full'. At that point the sending TCP stack will wait and will not send any more data until the client acknowledges some of the data that is pending.
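You can observe this directly. A minimal sketch (POSIX sockets, blocking mode): a sender that keeps calling send() to a peer which never reads. Once the peer's receive buffer and the advertised TCP window fill up, send() simply stops returning until data is acknowledged. The address and port here are placeholders.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void) {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(9000);                 /* placeholder port */
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);
    if (connect(s, (struct sockaddr *)&addr, sizeof addr) < 0) {
        perror("connect");
        return 1;
    }

    char chunk[65536];
    memset(chunk, 'x', sizeof chunk);
    long long total = 0;
    for (;;) {
        /* Once the client stops reading, this call eventually blocks:
           the local send buffer is full and the remote window is zero. */
        ssize_t n = send(s, chunk, sizeof chunk, 0);
        if (n < 0) { perror("send"); break; }
        total += n;
        printf("sent %lld bytes so far\n", total);
    }
    close(s);
    return 0;
}
```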

With UDP there's no such flow control system; it's unreliable after all. The UDP stacks on both client and server are allowed to drop datagrams if they feel like it, as are all routers between them. If you send more datagrams than the link can deliver to the client or if the link delivers more datagrams than your client code can receive then some of them will get thrown away. The server and client code will likely never know unless you have built some form of reliable protocol over basic UDP. Though actually you may find that datagrams are NOT thrown away by the network stack and that the NIC drivers simply chew up all available non-paged pool and eventually crash the system (see this blog posting for more details).
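Because the loss is silent, any detection has to happen at the application level. A minimal sketch of one common approach, assuming the sender stamps each datagram with a sequence number: the receiver notices gaps but nothing retransmits (reordering is ignored here for simplicity, and the socket is assumed already bound to a placeholder port).

```c
#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <sys/socket.h>

/* Receiver loop: call after binding a SOCK_DGRAM socket. Each datagram
   is assumed to carry a single network-order uint32_t sequence number. */
void recv_loop(int s) {
    uint32_t expected = 0;
    for (;;) {
        uint32_t seq;
        ssize_t n = recv(s, &seq, sizeof seq, 0);
        if (n != (ssize_t)sizeof seq)
            continue;                      /* runt or error: skip it */
        seq = ntohl(seq);
        if (seq != expected)               /* gap => datagrams were dropped */
            printf("gap: lost %u datagram(s)\n", seq - expected);
        expected = seq + 1;
    }
}
```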

Back to TCP: how your server code deals with the TCP window becoming full depends on whether you are using blocking I/O, non-blocking I/O, or async I/O.

  • If you are using blocking I/O then your send calls will block and your server will slow down; effectively your server is now in lock step with your client. It can't send more data until the client has received the pending data.

  • If the server is using non-blocking I/O then you'll likely get an error return that tells you that the call would have blocked (EWOULDBLOCK/EAGAIN on POSIX systems); you can do other things, but your server will need to retry the send later (see the sketch after this list).

  • If you're using async I/O then things may be more complex. With async I/O using I/O Completion Ports on Windows, for example, you won't notice anything different at all. Your overlapped sends will still be accepted just fine, but you might notice that they are taking longer to complete. The overlapped sends are being queued on your server machine and are using memory for your overlapped buffers and probably using up 'non-paged pool' as well. If you keep issuing overlapped sends then you run the risk of exhausting non-paged pool memory or using a potentially unbounded amount of memory as I/O buffers. Therefore, with async I/O and servers that COULD generate data faster than their clients can consume it, you should write your own flow-control code that you drive using the completions from your writes. I have written about this problem on my blog here and here and my server framework provides code which deals with it automatically for you.
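A minimal sketch of the non-blocking case from the list above (POSIX sockets; the socket is assumed to be connected and set to O_NONBLOCK via fcntl): when send() reports EWOULDBLOCK/EAGAIN the window or local buffer is full, so we poll() for writability and resume from where we left off rather than dropping or duplicating data.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Returns 0 once everything is sent, -1 on error. */
int send_all(int s, const char *buf, size_t len) {
    size_t off = 0;
    while (off < len) {
        ssize_t n = send(s, buf + off, len - off, 0);
        if (n > 0) {
            off += (size_t)n;              /* partial sends are normal */
        } else if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN)) {
            /* Peer hasn't acknowledged enough data yet: wait until the
               socket becomes writable again, then retry. */
            struct pollfd pfd = { .fd = s, .events = POLLOUT };
            if (poll(&pfd, 1, -1) < 0)
                return -1;
        } else {
            return -1;                     /* real error */
        }
    }
    return 0;
}
```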

As far as the data 'in flight' is concerned, the TCP stacks in both peers will ensure that the data arrives as expected (i.e. in order and with nothing missing); they'll do this by resending data as and when required.