pdeva

Reputation: 45541

How to scale websocket connection load as one adds/removes servers?

To explain the problem:

With HTTP:

Assume there are 100 requests/second arriving.

  1. If there are 4 servers, the load balancer (LB) can distribute the load across them evenly, 25/second per server
  2. If I add a server (5 servers total), the LB rebalances to 20/second per server
  3. If I remove a server (3 servers total), the load per server rises to 33.3/second

So the load per server rebalances automatically as I add/remove servers, since each connection is so short-lived.

With Websockets

Assume there are 100 clients, 2 servers (behind a LB)

  1. The LB initially balances each incoming connection evenly, so each server has 50 connections.
  2. However, if I add a server (3 servers total), the 3rd server gets 0 connections, since the existing 100 clients are already connected to the first 2 servers.
  3. If I remove a server (1 server total), all 100 clients will reconnect and are now served by 1 server.

Problem

Since websocket connections are persistent, adding/removing a server does not increase/decrease load per server until the clients decide to reconnect.

How does one then efficiently scale websockets and manage load per server?

Upvotes: 8

Views: 1339

Answers (1)

winrid

Reputation: 99

This is similar to problems the gaming industry has been trying to solve for a long time. Gaming is an area with many concurrent, persistent connections and a need for fast communication between many clients.

Options:

  1. Slave/master architecture, where the master keeps a connection to each slave to monitor health, load, etc. When a client joins the session/application, it pings the master and the master responds with the next server to use. This is a kind of client-side load balancing, except you are using server-side heuristics.

This prevents your clients from blowing up a single server. You'll have to have the client poll the master before establishing the WS connection, but that is simple.

This way you can also scale out to multiple masters if you need to and put them behind load balancers.

If you need to send a message between servers there are many options for that (handle it yourself, queues, etc).

This is how my drawing app Pixmap for Android, which I built last year, works. Works very well too.
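The master's "next server" decision above could be sketched like this (all names and the in-memory load table are illustrative assumptions, not from the answer; a real master would get these counts from its health-check connections to the slaves):

```python
# Hypothetical load table the master maintains from slave health reports:
# hostname -> current WebSocket connection count.
server_load = {
    "ws1.example.com": 48,
    "ws2.example.com": 50,
    "ws3.example.com": 0,   # freshly added server starts empty
}

def next_server() -> str:
    """Return the least-loaded server for the next client to connect to."""
    host = min(server_load, key=server_load.get)
    # Optimistically bump the count so back-to-back requests don't all
    # pile onto the same host before the next health report arrives.
    server_load[host] += 1
    return host

print(next_server())  # the new, empty ws3 absorbs incoming clients first
```

Because new clients always land on the least-loaded host, a freshly added server fills up naturally instead of sitting idle behind a dumb round-robin LB.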

  2. Client-side load balancing, where the client connects to a random host name. This is how Watch.ly works. Each host can then be its own load balancer and cluster of servers for safety. Risky but simple.
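The client side of that option is tiny. A minimal sketch, assuming a few illustrative host names (each of which could front its own cluster):

```python
import random

# Hypothetical pool of host names baked into (or fetched by) the client.
HOSTS = ["ws1.example.com", "ws2.example.com", "ws3.example.com"]

def pick_host() -> str:
    """Pick a host uniformly at random before opening the WebSocket.

    No central coordinator: with enough clients, load spreads out
    statistically across the pool.
    """
    return random.choice(HOSTS)
```

Adding a server means adding a host name to the pool; only new connections land on it, which is exactly the rebalancing gap the question describes.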

  3. Traditional load balancing, i.e. round robin. Hard to beat haproxy. This should be your first approach and will scale to many thousands of concurrent users. It doesn't solve the problem of redistributing load, though. One way to solve that with this setup is to push an event to your clients telling them to reconnect (and have each attempt to reconnect after a random timeout so you don't kill your servers).

Upvotes: 1
