I can imagine two setups:
Load-balance then cache:

    Load Balancer (haproxy) -+-- Cache server #1 (varnish) -- App server #1
                             |
                             +-- Cache server #2 (varnish) -- App server #2
                             |
                             +-- Cache server #3 (varnish) -- App server #3
Cache then load-balance:

    Cache Server (varnish) -- Load Balancer (haproxy) -+-- App server #1
                                                       |
                                                       +-- App server #2
                                                       |
                                                       +-- App server #3
The problem with the first setup is that there are multiple caches, which wastes a lot of memory and makes invalidating the cache more complicated, since a purge would have to be sent to every cache server.
The problem with the second setup is that there might be a performance hit from the extra hop, and that there are two single points of failure (varnish and haproxy) instead of just one (haproxy).
I'm tempted to go with the second setup because both haproxy and varnish are supposed to be fast and stable: what's your opinion?
I built a similar setup a few years back for a busy web application (only I did it with Squid instead of Varnish), and it worked out well.
I would recommend using your first setup (HAProxy -> Varnish) with two modifications:

- keepalived and a shared virtual IP, so that the load balancer itself is no longer a single point of failure
- the "balance uri" load-balancing algorithm, to optimize cache hits: HAProxy then routes a given URI to the same Varnish server every time, so each object is cached only once across the pool

Pros:
Cons:
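To make the HAProxy side of that recommendation concrete, here is a minimal configuration sketch; the server names, IP addresses, and ports are assumptions for illustration, not taken from the question:

    # haproxy.cfg sketch: bind on the shared virtual IP,
    # hash requests by URI across the Varnish servers
    frontend http-in
        bind *:80
        default_backend varnish_caches

    backend varnish_caches
        balance uri            # same URI always hashes to the same cache
        hash-type consistent   # keep the mapping stable when a server drops out
        server cache1 10.0.0.11:6081 check
        server cache2 10.0.0.12:6081 check
        server cache3 10.0.0.13:6081 check

With "hash-type consistent", losing one Varnish server only redistributes that server's share of URIs instead of reshuffling the whole pool.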
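And a sketch of the keepalived side on the two HAProxy boxes; again, the interface name, router ID, and VIP are hypothetical placeholders:

    # keepalived.conf sketch: shared virtual IP that fails over
    # between the primary and backup HAProxy machines
    vrrp_instance VI_1 {
        state MASTER            # BACKUP on the second HAProxy box
        interface eth0          # assumption: the public-facing interface
        virtual_router_id 51
        priority 101            # lower (e.g. 100) on the backup
        virtual_ipaddress {
            10.0.0.10/24        # the VIP clients actually connect to
        }
    }

Clients point at the VIP; if the primary HAProxy dies, keepalived moves the address to the backup within a few seconds.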