Proxying WebSockets with TCP load balancer without sticky sessions

Justin Meltzer picture Justin Meltzer · Mar 7, 2013 · Viewed 15.1k times · Source

I want to proxy WebSocket connections to multiple node.js servers using Amazon Elastic Load Balancer. Since Amazon ELB does not provide actual WebSocket support, I would need to use its vanilla TCP messaging. However, I'm trying to understand how this would work without some sort of sticky session functionality.

I understand that WebSockets work by first sending an HTTP Upgrade request from the client, which is handled by the server by sending a response which correctly handles key authentication. After the server sends that response and it is approved by the client, there is a bidirectional connection between that client and server.

However let's say the client, after approving the server response, sends data to the server. If it sends the data to the load balancer, and the load balancer then relays that data to a different server that did not handle the original WebSocket Upgrade request, then how will this new server be aware of the WebSocket connection? Or will the client automatically bypass the load balancer and send data directly to the server that handled the initial upgrade?

Answer

Dr. Jan-Philip Gehrcke picture Dr. Jan-Philip Gehrcke · Mar 7, 2013

I think what we need to understand in order to answer this question is how exactly the underlying TCP connection evolves during the whole WebSocket creation process. You will realize that the sticky part of a WebSocket connection is the underlying TCP connection itself. I am not sure what you mean with "session" in the context of WebSockets.

At a high level, initiating a "WebSocket connection" requires the client to send an HTTP GET request to an HTTP server whereas the request includes the Upgrade header field. Now, for this request to happen the client needs to have established a TCP connection to the HTTP server (that might be obvious, but I think here it is important to point this out explicitly). The subsequent HTTP server response is then sent through the same TCP connection.

Note that now, after the server response has been sent, the TCP connection is still open/alive if not actively closed by either the client or the server.

Now, according to RFC 6455, the WebSocket standard, at the end of section 4.1:

If the server's response is validated as provided for above, it is
said that The WebSocket Connection is Established and that the
WebSocket Connection is in the OPEN state

I read from here that the same TCP connection that was initiated by the client before sending the initial HTTP GET (Upgrade) request will just be left open and will from now on serve as the transport layer for the full-duplex WebSocket connection. And this makes sense!

With respect to your question this means that a load balancer will only play a role before the initial HTTP GET (Upgrade) request is made, i.e. before the one and only TCP connection involved in said WebSocket connection creation is established between the two communication end points. Thereafter, the TCP connection stays established and cannot become "redirected" by a network device in between.

We can conclude that -- in your session terminology -- the TCP connection defines the session. As long as a WebSocket connection is alive (i.e. is not terminated), it by definition provides and lives in its own session. Nothing can change this session. Speaking in this picture, two independent WebSocket connections, however, cannot share the same session.

If you referred to something else with "session", then it probably is a session that is introduced by the application layer and we cannot comment on that one.

Edit with respect to your comments:

so you're saying that the load balancer is not involved in the TCP connection

No, that is not true, at least in general. It definitely can take influence upon TCP connection establishment, in the sense that it can decide what to do with the client connection attempt. The specifics depend on the exact type of load balancer (* , see below). Important: After the connection is established between two endpoints -- whereas I don't consider the load balancer to be an endpoint, I refer to WebSocket client and WebSocket server -- the two endpoints will not change anymore for the lifetime of the WebSocket connection. The load balancer might* still be in the network path, but can be assumed to not take influence anymore.

Therefore the full-duplex connection is between the client and the end server?

Yes!

***There are different types of load balancing. Depending on the type, the role of the load balancer is different after connection establishment between the two end points. Examples:

  • If the load balancing happens on DNS basis, then the load balancer is not involved in the final TCP connection at all. It just tells the client to which host is has to connect directly.
  • If the load balancer works like the Layer 4 ELB from AWS (docs here), then it so to say proxies the TCP connection. So the client would actually see the ELB itself as the server. What happens, however, is that the ELB just forwards the packages in both directions, without change. Hence, it is still heavily involved in the TCP connection, just transparently. In this case there are actually two permanent TCP connections involved: one from you to the ELB, and one from the ELB to the server. These are again permanent for the lifetime of your WebSocket connection.