What is the workaround for TCP delayed acknowledgment?

Mr. Smith picture Mr. Smith · Mar 22, 2014 · Viewed 11.9k times · Source

I have shipped an online (grid-based) videogame that uses the TCP protocol to ensure reliable communication in a server-client network topology. My game works fairly well, but suffers from higher than expected latency (similar TCP games in the genre seem to do a better job at keeping latency to a minimal).

While investigating, I discovered that the latency is only unexpectedly high for clients running Microsoft Windows (as opposed to Mac OS X clients). Furthermore, I discovered that if a Windows client sets TcpAckFrequency=1 in the registry and restarts their machine, their latency becomes normal.

It would appear that my network design did not take into account delayed acknowledgement:

A design that does not take into account the interaction of delayed acknowledgment, the Nagle algorithm, and Winsock buffering can drastically effect performance. (http://support.microsoft.com/kb/214397)

However, I'm finding it nearly impossible to take into account delayed acknowledgement in my game (or any game). According to MSDN, the Microsoft TCP stack uses the following criteria to decide when to send one ACK on received data packets:

  • If the second data packet is received before the delay timer expires (200ms), the ACK is sent.
  • If there are data to be sent in the same direction as the ACK before the second data packet is received and the delay timer expires, the ACK is piggybacked with the data segment and sent immediately.
  • When the delay timer expires (200ms), the ACK is sent.

(http://support.microsoft.com/kb/214397)

Reading this, one would presume that the workaround for delayed acknowledgement on Microsoft's TCP stack is as follows:

  1. Disable the Nagle algorithm (TCP_NODELAY).
  2. Disable the socket's send buffer (SO_SNDBUF=0), so that a call to send can be expected to send a packet.
  3. When calling send, if no further data is expected to be sent immediately, call send again with a single-byte of data that will be discarded by the receiver.

With this approach, the second data packet will be received by the receiver at around the same time as the previous data packet. As a result, the ACK should get sent immediately from the receiver to the sender (emulating what TcpAckFrequency=1 does in the registry).

However, from my testing, this improved latency only by about a half of what the registry edit does. What am I missing?


Q: Why not use UDP?

A: I chose TCP because every packet I send needs to arrive (and be in order); there are no packets that arn't worth retransmitting if they get lost (or become unordered). Only when packets can be discarded/unordered, can UDP be faster than TCP!

Answer

Mr. Smith picture Mr. Smith · Sep 16, 2014

Since Windows Vista, TCP_NODELAY option must be set prior to calling connect, or (on the server) prior to calling listen. If you set TCP_NODELAY after calling connect, it will not actually disable Nagle algorithm, yet GetSocketOption will state that Nagle has been disabled! This all appears to be undocumented, and contradicts what many tutorials/articles on the subject teach.

With Nagle actually disabled, TCP delayed acknowledgements no longer cause latency.