Hi David,
Post by Willy Tarreau
Hi David,
Post by David
Hi,
Let's say I have an architecture where a couple of servers are put
behind a haproxy instance for load balancing, to serve content at
www.example.com. For reliability/availability purpose, I want to have
more than one haproxy instance to ensure continuous servicing when one
of them fails:
       LB1            LB2
        |              |
------------------------------------
   |            |            |
Server1      Server2      Server3
The issue is how to "distribute" the load across both load balancers. I
see two options:
(1) DNS Round Robin: www.example.com resolves to both LB1's and LB2's
IP. If e.g. LB1 crashes, clients will then try the next entry, LB2
in this case.
(2) High Availability IP (heartbeat, keepalived) between both load
balancers. Only one load balancer proxies all the requests at a time
(I assume one load balancer has enough power to serve all our traffic).
Another couple of options are (since I've just researched this very
thing not more than 5 months ago):
(3) ClusterIP: a multicast MAC address clustering method for a single
IP address. The downside is that all traffic is sent to _all_
nodes. It must be used in conjunction with one of the cluster
controllers below (Corosync + Pacemaker or Heartbeat 3.x + Pacemaker),
since it is implemented in iptables and needs 'controlling software'
to tell each node which part of the traffic to handle.
http://security.maruhn.com/iptables-tutorial/x8906.html
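To illustrate the "which part of the traffic to handle" idea: every node
sees every packet, hashes the client address into a bucket, and only
answers if it currently owns that bucket. The Python sketch below is a
toy model under assumptions, not the iptables implementation: zlib.crc32
stands in for whatever hash the kernel really uses, and the names
bucket_for, should_answer, NODE1_BUCKETS and NODE2_BUCKETS are made up
for this example.

```python
import zlib

def bucket_for(src_ip, total_buckets):
    """Map a client source IP to a bucket number in 1..total_buckets.

    crc32 is only a stand-in hash for illustration (assumption).
    """
    return zlib.crc32(src_ip.encode("ascii")) % total_buckets + 1

def should_answer(src_ip, owned_buckets, total_buckets):
    """A node answers only for the buckets it currently owns."""
    return bucket_for(src_ip, total_buckets) in owned_buckets

# Two-node cluster: each node owns one of the two buckets.
NODE1_BUCKETS = {1}
NODE2_BUCKETS = {2}

# On failure of node 1, the controlling software would hand bucket 1
# to node 2, i.e. node 2 would then own {1, 2} and answer everything.
```

The point of the model is that failover is just a change of bucket
ownership on the surviving node; the shared IP itself never moves.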
(4) Common Address Redundancy Protocol (CARP): Protocol originally
developed under BSD. Often used in Firewalls. Alternative to Cisco's
proprietary HSRP protocol. Linux has a user-space implementation (uCARP)
http://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol
(5) RedHat Cluster Suite: Similar to Corosync + Pacemaker but was
developed separately - moving slowly towards using Pacemaker as a
resource controller. In RHEL6, Pacemaker can be used as the resource
controller.
(6) Corosync + Pacemaker or Heartbeat 3.x + Pacemaker (the newer
cluster control software): these are extremely good if you have
multiple things to control as part of a cluster. Heartbeat 2.x was
split into Heartbeat 3 + Pacemaker to streamline development. Corosync
provides the 'cluster membership' layer and is required if you are using
shared file systems; Pacemaker is a generic resource controller.
The best solution depends on your requirements, but an optimal one
(with no requirement for a server to sit dormant until failure, as in
pure HA) for a 2-node load balancer is:
- 2 IPs registered in DNS-RR
- 2 virtual IPs, which are those listed in DNS-RR
- haproxy running on both nodes
- Cluster controlling software (RHCS or Corosync + Pacemaker or Heartbeat
  + Pacemaker) managing the virtual IPs and haproxy on both nodes
According to current advice from the linux-cluster and pacemaker mailing
lists, stay away from Heartbeat 2.x. I know it is supplied by most
distributions, but it is buggy and doesn't handle certain corner cases well.
Hope this helps.
Post by Willy Tarreau
Post by David
I have been asked to implement the DNS RR method, but from what I have
read, method 2 is the one most commonly used. What are the pros/cons of
each ?
The first one is just pure theory. You may want to test it by yourself
to conclude that it simply does not work at all. Most clients will see
a connection error or timeout, and few of them will be able to perform
a retry on the other address but after some delay which will cause some
unpleasant experience. Also, most often the browser does not perform a
new lookup if the first one has already worked. That means that until
the browser is closed, the visitor will remain bound to the same IP.
Then you might think that it's enough to update the DNS entries upon
failure, but that does not propagate quickly, as there are caches
everywhere. To give you an idea, the haproxy ML and site were moved to
a new server one month ago, and we're still receiving a few requests a
minute on the old server. In general you can count on 1-5% of the visitors
caching an entry for more than one week. This is not a problem for disaster
recovery, but it certainly is for a server failover, because it means you
cannot take it offline at all.
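For what it's worth, here is roughly what a client would have to do for
DNS RR to act as failover: walk the address list and fall back to the
next entry when one fails. This is only a Python sketch under
assumptions; connect_any is a made-up helper name, and as described
above most real clients do not behave this way. Note that each dead
address costs up to `timeout` seconds before the next one is tried,
which is the delay visitors would feel.

```python
import socket
import time

def connect_any(addresses, port, timeout=3.0, connect=socket.create_connection):
    """Try each address in order.

    Returns (connection, address_used, seconds_spent).
    `connect` is injectable so the logic can be exercised without a network.
    """
    started = time.monotonic()
    last_error = None
    for address in addresses:
        try:
            conn = connect((address, port), timeout=timeout)
            return conn, address, time.monotonic() - started
        except OSError as exc:  # refused, unreachable, timed out, ...
            last_error = exc
    raise ConnectionError("all %d addresses failed" % len(addresses)) from last_error

# Hypothetical usage, with the addresses taken from a DNS lookup:
#   infos = socket.getaddrinfo("www.example.com", 80, type=socket.SOCK_STREAM)
#   addresses = [info[4][0] for info in infos]
#   conn, used, spent = connect_any(addresses, 80)
```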
High availability has the big advantage of always exposing a working
service for the same IP address, so it's a *lot* more reliable and
transparent to users. There are two common ways to provide HA under
Linux:
- heartbeat
- keepalived
The first one is more suited to data servers, as it ensures that no more
than one node is up at a time. This is critical when you share file systems.
The second one is more suited to stateless servers such as proxies and load
balancers, as it ensures that no less than one node is up at a time. Sadly
people generally confuse them and sometimes use keepalived for NFS servers
or use heartbeat with haproxy...
High availability presents a small inconvenience though: the backup node
is never used so you don't really know if it works well, and there is a big
temptation not to update it as often as the master node. This is also an
advantage in that it allows you to validate your new configs on it before
loading them on the master node. If you want to use both LBs at the same
time, the solution is to have two crossed VIPs on your LBs and use DNS RR
to ensure that both are used. When one LB fails, the VIP moves to the other
one.
- DNS = load balancing, no availability at all
- HA = availability, no load balancing at all.
=> use DNS to announce always available IP addresses
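The crossed-VIP arrangement can be modelled in a few lines. This Python
sketch only simulates the placement decision; in a real setup the VIPs
would be moved by keepalived or the cluster software, and the addresses,
node names and the place_vips helper below are made up for illustration.

```python
def place_vips(preferred_owner, alive):
    """Return {vip: node} given each VIP's preferred node and the live nodes.

    A VIP stays on its preferred node while that node is alive; otherwise
    it moves to a surviving node (None if no node is left).
    """
    placement = {}
    for vip, owner in preferred_owner.items():
        if owner in alive:
            placement[vip] = owner
        else:
            survivors = sorted(alive)
            placement[vip] = survivors[0] if survivors else None
    return placement

# Both VIPs are published in DNS RR; each LB is master for one of them.
PREFERRED = {"192.0.2.10": "LB1", "192.0.2.11": "LB2"}

normal = place_vips(PREFERRED, {"LB1", "LB2"})   # one VIP per LB
lb1_down = place_vips(PREFERRED, {"LB2"})        # LB2 holds both VIPs
```

Either way, both addresses announced in DNS keep answering, so cached
DNS entries stop being a problem.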
Cheers,
Willy
--
Best Regards,
Brett Delle Grazie