How to use thousands of backends in haproxy? Is the new map feature useful for this?

rcrc · Feb 25, 2014

I want to use haproxy as a proxy and load balancer for thousands of backends. So a request needs to be proxied to the correct backend depending on hostname and then load balanced within the backend. I am using haproxy-1.5dev21.

The config file looks like this:

frontend public
  bind :80
  mode http
  acl host1 hdr_reg(host) host1.com
  use_backend be_host1 if host1

  acl host4000 hdr_reg(host) host4000.com
  use_backend be_host4000 if host4000

backend be_host[n]
  server hostn_1
  server hostn_2

The problem is that I get an added latency of 30 ms per request with 5000 hosts. With 20k backends, haproxy takes a long time to load, not to mention the disastrous effect on per-request latency.

Can I do something better than sequential acl rules? I have not found an example of the new map feature; the release notes say it can be used for massive redirect rules. I tried this:

use_backend %[hdr(host), map(host_to_backend_map.file)]

I am obviously doing something wrong with the map above, but any guidance would be helpful. Thanks!

Answer

rcrc · Mar 28, 2014

After expert input I removed several shortcomings from the config file. I list them here in case anyone else finds them useful.

  1. Use hdr(host) instead of hdr_reg(). An exact string match is much cheaper than running a regex for every rule, so this vastly reduces the time spent evaluating the ACLs. Even better, skip the named acl and use an inline (anonymous) ACL, e.g.

    use_backend host1 if { req.fhdr(host,1) -m str host1.domain.com }

  2. Set nbproc greater than 1. This helps when there are many concurrent connections, though it makes debugging harder.

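  3. For the backends, use the IP address directly in the server lines instead of a hostname as in 'server hostn_1 dns_of_server:port_number'. This avoids a DNS lookup for every server when the config is loaded.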

  4. Put 'fullconn 1000' in the defaults section. This improves the load time immensely.
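
Put together, the four points above could look roughly like the sketch below. This is only an illustration under assumptions: the 192.0.2.x addresses, the port 8080, the server names and the nbproc value of 2 are made up for the example, and nbproc applies to the 1.5-era releases discussed here.

    global
      nbproc 2                         # point 2: more than one process (harder to debug)

    defaults
      mode http
      fullconn 1000                    # point 4: set once in defaults instead of per backend

    frontend public
      bind :80
      # point 1: exact string match on the first Host header, inline instead of a named acl
      use_backend be_host1 if { req.fhdr(host,1) -m str host1.domain.com }

    backend be_host1
      # point 3: literal IP addresses (hypothetical), no hostnames to resolve at load time
      server host1_1 192.0.2.11:8080
      server host1_2 192.0.2.12:8080

Each additional host still needs its own use_backend line, so the rules are still evaluated one after another; the gain comes from each comparison being cheap.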

Finally, use the latest haproxy git checkout and observe the improvement in load time. It has come down considerably: it's now on the order of seconds, compared to minutes before.

Also, regarding the 'map' feature, a new dynamic use_backend scheme is in the works that should remove the need to write as many ACLs.
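
For readers on a release where use_backend accepts a dynamic name (as far as I know, the 1.5 stable release and later, but not the 1.5-dev21 build from the question), a map-based setup could look roughly like the sketch below. The file path, the be_default fallback backend and its address are assumptions made for illustration. Note that there is no space after the commas inside %[...]; the attempt in the question has one, and since haproxy splits configuration arguments on whitespace, that space is a likely reason the line was rejected.

    # /etc/haproxy/host_to_backend.map  (hypothetical path)
    # one "<Host header value> <backend name>" pair per line
    host1.com       be_host1
    host4000.com    be_host4000

    # haproxy.cfg
    frontend public
      bind :80
      mode http
      # lowercase the Host header, look it up in the map,
      # and fall back to be_default when there is no entry
      use_backend %[req.hdr(host),lower,map(/etc/haproxy/host_to_backend.map,be_default)]

    backend be_default
      server fallback_1 192.0.2.1:8080   # hypothetical address

With this layout the frontend holds a single rule regardless of the number of hosts, and the lookup goes through the map's string index rather than a chain of ACLs. The backends named in the map still have to be declared as usual.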