WebSockets ping/pong, why not TCP keepalive?

Thomas picture Thomas · Apr 23, 2014 · Viewed 49k times · Source

WebSockets have the option of sending pings to the other end, where the other end is supposed to respond with a pong.

Upon receipt of a Ping frame, an endpoint MUST send a Pong frame in response, unless it already received a Close frame. It SHOULD respond with Pong frame as soon as is practical.

TCP offers something similar in the form of keepalive:

[Y]ou send your peer a keepalive probe packet with no data in it and the ACK flag turned on. You can do this because of the TCP/IP specifications, as a sort of duplicate ACK, and the remote endpoint will have no arguments, as TCP is a stream-oriented protocol. On the other hand, you will receive a reply from the remote host (which doesn't need to support keepalive at all, just TCP/IP), with no data and the ACK set.

I would think that TCP keepalive is more efficient, because it can be handled within the kernel without the need to transfer data up to user space, parse a websocket frame, craft a response frame, and hand that back to the kernel for transmission. It's also less network traffic.

Furthermore, WebSockets are explicitly specified to always run over TCP; they're not transport-layer agnostic, so TCP keepalive is always available:

The WebSocket Protocol is an independent TCP-based protocol.

So why would one ever want to use WebSocket ping/pong instead of TCP keepalive?

Answer

Marquis of Lorne picture Marquis of Lorne · Apr 23, 2014

The problems with TCP keepalive are:

  1. It is off by default.
  2. It operates at two-hour intervals by default, instead of on-demand as the Ping/Pong protocol provides.
  3. It operates between proxies rather than end to end.
  4. As pointed out by @DavidSchwartz, it operates between TCP stacks, not between the applications.

The comparison with WebSockets ping/pong isn't meaningful. TCP keepalive is automatic, and timed, when enabled, whereas WebSocket ping/pong is executed as required by the application.