High Availability @ Load balancing /DNS Layer
DNS and Load Balancing(LB) layer are the main entry points for a web application. There is no point in building sophisticated clusters, replications and heavy server farms in the Web/App and DB layer without building High Availability in the DNS/LB layer. If the LB is built with Single Point of Failure (SPOF) it will bring down the entire site during outage.
Following are some of the common architecture practices for building highly available Load Balancing Tier in AWS:
Practice 1)Use Amazon Elastic Load Balancers for the Load Balancing tier for HA. Amazon Elastic Load Balancing automatically distributes incoming application traffic across multiple Amazon EC2 instances. It enables you to achieve even greater fault tolerance in your applications, seamlessly providing the amount of load balancing capacity needed in response to incoming application traffic. It can handle thousands of concurrent connections easily and can elastically expand when more loads hit. ELB is an inherently fault tolerant building block and it can handle the failure in its layer automatically. When the load increases usually more EC2 Load balancer instances are automatically added in the ELB tier. This automatically avoids SPOF and overall site will be operational even if some LB EC2 instances in ELB tier have failed. Also Amazon ELB will detect the health of the underlying Web/App EC2 instances and will automatically route requests to the Healthy Web/App EC2 in case of failure. Amazon ELB can be configured with Round Robin in case the Web/App EC2 layer is Stateless and can be configured with Session Sticky for State full application layers. If the Session Synchronization is not done properly in the app layer then even Session Sticky balancing cannot save failover error pages and website failures.
Practice 2) Sometimes the application demands the need of :
- Sophisticated Load balancers with Cache feature (like Varnish)
- Load balancing Algorithms like least connection, Weighted
- Have heavy sudden load spikes
- Most of the request load generated from specific source IP range servers ( Example : Load Generated from System A IP to AWS)
- Require the Fixed IP address of the Load Balancer to registered
For all the above mentioned cases, having Amazon ELB in your Load Balancing tier will not be right choice. Hence for such scenarios we recommend usage of Software Load Balancers/Reverse Proxies like Nginx, HAProxy, Varnish, Zeus etc in the Load balancing tier configured on EC2 instances. Such architecture can pose SPOF at the load balancing tier. To avoid this, usually it is recommended to have multiple Load Balancers in the LB tier. Even if a single Load balancer fails other load balancers are still operational and will take care of the user requests. Load Balancers like Zeus comes with inbuilt HA and LB Cluster Sync capability. For most use cases, inbuilt HA between LB is not needed and can be handled efficiently combining it with DNS RR load balancing technique. Let us see about this technique in detail below, before that there are certain points to be considered while architecting a robust LB tier in AWS:
- Multiple Nginx and HAProxy can be configured for HA in AWS, they can detect the health of the underlying Web/App EC2 instances and can automatically route requests to the Healthy Web/App EC2 instances
- Nginx and HAproxy can be configured with Round Robin algorithm in case the Web/App EC2 instances are Stateless and can be configured with Session Sticky for Stateful application servers. If the Session Synchronization is not done properly in the app layer then even Session Sticky balancing cannot save error pages.
- Scale out of Load Balancers is better than Scale UP in AWS. Scale out model inherently adds new LB’s into this tier avoiding SPOF. Scale out of Load balancers like Nginx, HAProxy requires custom scripts/templates to be developed. It is not recommended to use Amazon AutoScaling for this layer.
- Load Balancers placed under the DNS needs the AWS Elastic IP to be attached. Since ElasticIP attach/Detach takes around ~120 seconds or more in some AWS regions, it is not advisable to run 2 LB’s switched between single ElasticIP model. It is recommended to run 2 or more LB EC2 instances attached individually with ElasticIP’s all running in active-active model under DNS/Route53.
- In event a Load Balancer EC2 has failed, we can detect this using CloudWatch or Nagios alerts and can manually or automatically launch another LB EC2 in few seconds ~ minutes in AWS for HA.
Now lets get a layer above LB -> The "DNS" ,Amazon Route 53 is a highly available and scalable Domain Name System (DNS) web service. Route 53 effectively connects user requests to infrastructure running in Amazon Web Services (AWS) – such as an Amazon Elastic Compute Cloud (Amazon EC2) instance, an Amazon Elastic Load Balancer, or an Amazon Simple Storage Service (Amazon S3) bucket – and can also be used to route users to infrastructure outside of AWS. AWS Route 53 is AWS inherently fault tolerant building block and a Managed DNS service. Amazon Route53 can be configured using Console , API's to do DNS level Load Balancing. RR or Weighted algorithms can be configured at Route53 level and requests can be load balanced between the Multiple Load Balancer EC2 or ELB configured under the Route53. Route53 DNS RR cannot check the health of the Load balancers and direct the request; hence it relies on the browser or the web clients to do the transparent retry in case they encounter error pages.
Related Articles : AWS High Availability Patterns Series