I have a socket.io chat room running on a single machine, and its traffic keeps growing. We have run benchmarks using the ws library for sockets, and it performs much better, which would make better use of our hardware. Switching would mean rewriting our application, though.
Our socket.io application allows users to create private chat rooms, which are implemented using namespaces. For example:
localhost:8080/room/1
localhost:8080/room/2
localhost:8080/room/3
When everything is in one instance it is quite easy, but now we are looking to expand this capacity into multiple nodes.
We run this instance in Amazon's cloud. Previously it looked like scaling websockets was an issue with ELBs. We have noticed that Amazon now offers an Application Load Balancer (ALB), which supports websockets. This sounds great, but after reading the documentation I must admit I don't really know what it means. If I am using socket.io with thousands of namespaces, do I just put instances behind this ALB and everything will work? My main question is:
If x number of users join a namespace, will the ALB automatically route my messages to and from the proper users? Let's say I have 5 vanilla socket.io instances running behind the ALB. User 1 creates a namespace. A few hours later, User 99999 comes along and wants to join this namespace. Will any additional code need to be written for this, or will the ALB route everything where it should go? Does the same go for sending and receiving messages?
While the ALB will load balance the users correctly, you will need to adapt your code a little, since users who joined a specific room will be spread across different servers.
In their documentation socket.io provides a way to do this:
Now that you have multiple Socket.IO nodes accepting connections, if you want to broadcast events to everyone (or even everyone in a certain room) you’ll need some way of passing messages between processes or computers.
The interface in charge of routing messages is what we call the Adapter. You can implement your own on top of the socket.io-adapter (by inheriting from it) or you can use the one we provide on top of Redis: socket.io-redis:
var io = require('socket.io')(3000);
var redis = require('socket.io-redis');
io.adapter(redis({ host: 'localhost', port: 6379 }));
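As a rough sketch of how a room namespace could be wired up once the adapter is attached (the helper name `registerRoom` and the `message` event name are assumptions for illustration, not part of socket.io or the original application):

```javascript
// Hypothetical helper: wires up one chat room namespace on this node.
// `io` is assumed to be the socket.io server with the Redis adapter attached.
function registerRoom(io, roomPath) {
  var room = io.of(roomPath); // e.g. '/room/1'
  room.on('connection', function (socket) {
    // 'message' is an assumed event name for chat messages.
    socket.on('message', function (msg) {
      // Emitting on the namespace goes through the adapter, so clients of
      // this namespace connected to other nodes should receive it as well.
      room.emit('message', msg);
    });
  });
  return room;
}
```

With the server from the snippet above, this would be called as `registerRoom(io, '/room/1');` on every node.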
ALB setup
I recommend enabling sticky sessions on your ALB. Otherwise the socket.io handshake will fail when using a non-websocket transport such as long polling, because the handshake over these transports takes more than one request, and all of those requests must reach the same server.
What if I wanted to avoid having a Redis database? For example, my rooms are created by users: if userA creates a room on instance 4 and another user wants to join this room, how would they know which instance it is on? Would I need the adapter here too?
The goal of this alternative is to have each room assigned to a specific EC2 instance. We're going to achieve this using ALB routing.
N rooms > 1 instance.
You will need to change your rooms URL to something like:
/i1/room/550
/i1/room/20
/i2/room/5
/i5/room/492
where the format is:
/{instance-number}/room/{room-id}
This is needed so ALB can route each room to a specific instance.
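A minimal sketch of helpers for this URL scheme, assuming numeric instance numbers and room ids as in the examples above (the function names `buildRoomPath` and `parseRoomPath` are hypothetical):

```javascript
// Hypothetical helpers for the /{instance-number}/room/{room-id} scheme.
// The "i" prefix matches the example paths; numeric ids are an assumption.

// Build the path a client should connect to for a given instance and room.
function buildRoomPath(instanceNumber, roomId) {
  return '/i' + instanceNumber + '/room/' + roomId;
}

// Parse a path back into its parts; returns null when the path does not match.
function parseRoomPath(path) {
  var match = /^\/i(\d+)\/room\/(\d+)$/.exec(path);
  if (!match) {
    return null;
  }
  return { instance: Number(match[1]), roomId: Number(match[2]) };
}
```

Because the instance number is part of the URL, a user joining an existing room only needs the room's full path to end up on the right instance.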
Create N target groups (N being the number of instances you have at the moment)
Register each instance with its own target group:
Target Groups > Instance X target group > Target tab > Edit > Choose instance X > add to registered
Target group X > EC2 Instance X
Target group Y > EC2 Instance Y
Edit ALB listener rules
Load Balancers > Your ALB > Listeners > View/Edit Rules
Create one rule per target group/instance with the following settings:
Path pattern: /iX/room/*
Forward to: the target group for instance X
Once you have this set up, when you enter:
/i1/room/550
you will be using EC2 Instance 1, and
/i2/room/200
will be using EC2 Instance 2, and so on.
Now you will have to write your own logic to keep the rooms balanced across your instances. You don't want one instance hosting almost all the rooms.
I recommend the first approach (the Redis adapter), since it can be autoscaled easily.
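One possible way to do this is to assign each new room to the instance currently hosting the fewest rooms. The sketch below keeps the assignment table in memory purely for illustration; `createRoomBalancer` and its methods are hypothetical names, and in a real multi-instance setup the table would have to live somewhere every instance can reach.

```javascript
// Hypothetical least-loaded room assignment.
// instanceNumbers: the instance numbers used in the /iX/room/* rules.
function createRoomBalancer(instanceNumbers) {
  var roomCounts = new Map();     // instance number -> rooms hosted
  var roomToInstance = new Map(); // room id -> instance number
  instanceNumbers.forEach(function (n) { roomCounts.set(n, 0); });

  return {
    // Return the instance for a room, assigning a new room to the
    // instance with the fewest rooms (ties go to the earliest listed).
    assignRoom: function (roomId) {
      if (roomToInstance.has(roomId)) {
        return roomToInstance.get(roomId);
      }
      var best = instanceNumbers[0];
      instanceNumbers.forEach(function (n) {
        if (roomCounts.get(n) < roomCounts.get(best)) {
          best = n;
        }
      });
      roomCounts.set(best, roomCounts.get(best) + 1);
      roomToInstance.set(roomId, best);
      return best;
    },
    // Forget a closed room so its instance frees up capacity.
    releaseRoom: function (roomId) {
      if (!roomToInstance.has(roomId)) {
        return;
      }
      var n = roomToInstance.get(roomId);
      roomCounts.set(n, roomCounts.get(n) - 1);
      roomToInstance.delete(roomId);
    }
  };
}
```

The returned instance number is what goes into the room's URL, for example `'/i' + balancer.assignRoom(550) + '/room/550'`.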