Friday, November 16, 2012

Comparison Analysis:Amazon ELB vs HAProxy EC2

In this article i have analysed Amazon Elastic Load Balancer (ELB) and HAProxy (popular LB in AWS infra) in the following production scenario aspects and fitment:

Algorithms: In terms of algorithms ELB provides Round Robin and Session Sticky algorithms based on EC2 instance health status. HAProxy provides variety of algorithms like Round Robin, Static-RR, Least connection, source, uri, url_param etc. For most of the production cases use Round Robin and Session Sticky is more than enough, But in case you require algorithms like least connection you might have to lean towards HAProxy currently. In future AWS might add this algorithm in their Load Balancer

Spikey or Flash Traffic: Amazon ELB is designed to handle unlimited concurrent requests per second with “gradually increasing” load pattern.  It is not designed to handle heavy sudden spike of load or flash traffic. For example: Imagine an e-commerce website whose traffic increases gradually to thousands of concurrent requests/sec in hours, Whereas imagine use cases like Mass Online Exam or GILT load pattern or 3-Hrs Sales/launch campaign sites expecting 20K+ concurrent requests/sec spike suddenly in few minutes, Amazon ELB will struggle to handle this load volatility pattern. If this sudden spike pattern is not a frequent occurrence then we can Pre-warm ELB else we need to look for alternative Load balancers like HAProxy in AWS infrastructure. If you expect a sudden surge of traffic you can provision X number of HAProxy EC2 instances in running state.

Gradually Increasing Traffic: Both Amazon ELB and HAProxy can handle gradually increasing traffic. But when your needs become elastic and traffic increases in a day, you either need to automate or manually add new HAProxy EC2 instances when the threshold is breached. Also when the load decreases you may need to manually remove the HAProxy EC2 instances from Load Balancing Tier. If you want to avoid these manual efforts you may need to engineer using automation scripts and programs.  Amazon has intelligently automated this elastic problem in their ELB Tier. We just need to configure and use this, that's all.

Protocols : Currently Amazon ELB only supports following protocols: HTTP, HTTPS (Secure HTTP), SSL (Secure TCP) and TCP protocols. ELB supports load balancing for the following TCP ports: 25, 80, 443, and 1024-65535. In case RTMP or HTTP Streaming protocol is needed, we need to use Amazon CloudFront CDN in your architecture. HAProxy can support both TCP and HTTP protocols. In case HAProxy EC2 instance is working in pure TCP mode. A full-duplex connection will be established between clients and servers, and no layer 7 examination will be performed. This is the default mode. It can be used for SSL, SSH, SMTP etc. Current 1.4 version of HAProxy does not support HTTPS protocol natively, you may need to use Stunnel or Stud or Nginx before HAProxy to do the SSL termination. HAProxy 1.5 dev-12 comes with SSL support, it will become production ready soon. 

Timeouts: Amazon ELB currently timeouts persistent socket connections @ 60 seconds if it is kept idle. This condition will be a problem for use cases which generates large files (PDF, reports etc) at backend EC2, sends them as response back and keeps connection idle during entire generation process. To avoid this you'll have to send something on the socket every 40 or so seconds to keep the connection active in Amazon ELB. In HAProxy you can configure very large socket timeout values to avoid this problem. 

White listing IP's :Some Enterprises might want to white list 3rd party Load Balancer IP range in their firewalls . If the 3rd party service is hosted using Amazon ELB it will become a problem. Currently Amazon ELB does not provide fixed or permanent IP address for the Load balancing instances that are launched in its tier. This will be a bottleneck for enterprises which have compulsion to white list the Load balancer IP’s in external firewalls/gateways. For such use cases, currently we can use HAProxy EC2 attached with Elastic IPs as load balancers in AWS infrastructure and white list the Elastic IP's.

Amazon VPC/ Non VPC : VPC- Virtual Private Cloud. Both Amazon ELB and HAProxy EC2 can work inside the VPC and Non VPC environments of AWS.

Internal Load Balancing: Both Amazon ELB and HAProxy can be used for internal load balancing inside VPC. You might provide a service that is consumed internally by the other applications which needs load balancing. ELB and HAProxy can fit in the same. In case internal Load balancing is required in Amazon Non-VPC environments, ELB is not capable currently and HAProxy can be deployed. 

URI/URL based Load balancing: Amazon ELB cannot Load Balance based on URL patterns like other Reverse proxies. Example Amazon ELB cannot direct and load balance between request URLs and Currently for such use cases you can use HAProxy on EC2.

Sticky problem: This point comes as a surprise to many users using Amazon ELB. Amazon ELB behaves little strange when incoming traffic is originated from Single or  Specific IP ranges, it does not efficiently do round robin and sticks the request to some EC2's only. Since i do not know the ELB internals i assume ELB might be using "Source" algorithm as default for such conditions. No such cases were observed with HAProxy EC2 in AWS unless the balance algorithm is "Source". In HAProxy you can combine "Source" and "Round Robin" efficiently. In case the HTTP request does not have cookie it uses source algorithm, but if the HTTP request has a cookie HAProxy automatically shifts to RR or Weighted. (I will have to check this with AWS team)

Logging: Amazon ELB currently does not provide access to its log files for analysis. We can only monitor some essential metrics using CloudWatch for ELB. We cannot debug load balancing problems, analyze the traffic and access patterns; categorize bots / visitors etc currently because we do not have access to the ELB logs.This will also be a bottleneck for some organizations which has strong audit/compliance requirements to be met at all layers of their infrastructure. In case very strict/specific log requirements are needed, You might need to use HAProxy on EC2, in case it suffices the need. 

Monitoring: Amazon ELB can be monitored using Amazon CloudWatch. Refer this URL for ELB metrics that can be currently monitored: CloudWatch+ELB is detailed for most use cases and provides consolidated result of the entire ELB tier in console/API. On the other hand HAProxy provides user interface and stats for monitoring its instances. But if you have farms(20+) of HAProxy EC2 instances it becomes complex to manage this monitoring part efficiently. You can use tools like ServerDensity to monitor such HAProxy farms, but it has huge dependency on NAT instances availability for inside Amazon VPC deployments.

SSL Termination and Compliance requirements:
SSL Termination can be done at 2 levels using Amazon ELB in your application architecture .They are
  • SSL termination can be done at Amazon ELB Tier, which means connection is encrypted between Client(browser etc) and Amazon ELB, but connection between ELB and Web/App EC2 is clear. This configuration may not be acceptable in strictly secure environments and will not pass through compliance requirements.
  • SSL termination can be done at Backend with End to End encryption, which means connection is encrypted between Client and Amazon ELB, and connection between ELB and Web/App EC2 backed is also encrypted. This is the recommended ELB configuration for meeting the compliance requirements at LB level. 
HAProxy 1.4 does not support SSL termination directly and it has to be done in Stunnel or Stud or Nginx layer before HAProxy. HAProxy 1.5 dev-12 comes with SSL support, it will become production ready soon, i have not yet analyzed/tested the backend encryption support in this version.

Scalability and Elasticity : Most important architectural requirements of web scale systems are scalability and elasticity. Amazon ELB is designed for this and handle these requirements with ease.Elastic Load Balancer does not cap the number of connections that it can attempt to establish with the load balanced Amazon EC2 instances.Amazon ELB is designed to handle unlimited concurrent requests per second. ELB is inherently scalable and it can elastically increase /decrease its capacity depending upon the traffic. According to a benchmark done by RightScale, Amazon ELB was easily able to scale out and handle 20K+ or more concurrent requests /sec. Refer URL:
Note: The load testing was stopped after 20K req/sec by RightScale because ELB kept expanding its capacity. Considerable of DevOps engineering is needed to automate this functionality with HAProxy.

High Availability: Amazon ELB is inherently fault tolerant and a Highly available service. Since it is a managed service, Unhealthy load balancer instances are automatically replaced in ELB tier. In case of HAProxy, you need to do this work yourself and build HA on your own. Refer URL to understand more about High Availability @ Load Balancing Layer using HAProxy.

Integration with Other services: Amazon ELB can be configured with work seamlessly with Amazon AutoScaling, Amazon CloudWatch and Route 53 DNS services. The new web EC2 instances launched by Amazon AutoScaling are added to the Amazon ELB for Load balancing automatically and whenever load drops; existing EC2 instances can be removed by Amazon Auto Scaling from ELB. Amazon AutoScaling and CloudWatch cannot be integrated seamlessly with HAProxy EC2 for this functionality. But HAProxy can be integrated with Route53 easily for DNS RR/Weighted algorithms.   

Cost: If you run a ELB in US-East Amazon EC2 region for a month (744 hrs) processing close to 1 TB of data, it will cost around ~26 USD (ELB usage+Data charge).  In case if you use HAProxy (2 X m1.large EC2 for HAProxy, S3 backed AMI, Linux instances, No EBS attached) as base capacity and add upto 4 or more m1.large EC2 depending upon traffic. It will minimum cost 387 USD for EC2 compute + Data Charges to start with. it is very clear and evident that larger deployments can save lots of cost and immensely benefit using Amazon ELB compared to HAProxy on EC2.  


jeff said...

Harish - Great article as always. It might be worth discussing two additional points: Portability (this used to be a bigger issue before TopStack ELB) and use of external IP's (vs host names) for the LB. This is an issue for some people.

Willy Tarreau said...

This is a very nice and objective comparison. Concerning the costs, for small to medium deployments, haproxy can cost exactly zero because you can install it on one of your existing servers. So you already have the server, you already pay for the public traffic and all inter-server communications are on the private network so they are free.

Even better, when you enable compression on haproxy, it can even reduce your bandwidth costs, because even if your servers don't compress, they will transfer the uncompressed traffic on the private addresses (free) and haproxy will compress the data before moving them to the public addresses.

These clearly are important points to consider.

I'm adding a link to your article on the haproxy web site because it's very useful in my opinion.


Anonymous said...

Great article, Harish.

Having been hit by HTTP/503 responses in a spikey app, which gets flooded very suddenly, I've been looking for information that might explain how ELBs respond to surges. Do you have any pointers to numbers which quantify how AWS's "gradually increasing" load pattern pans out, Harish?

I'm reluctant to go a Round-Robin DNS route, but right now that looks like the best solution for our app, unless pre-warming before the surge is realistic via (say) a suitable scheduled Apache Bench run. Our surge is very predictable.

Anonymous said...

i finally tracked down some numbers on how fast elb scales. from here:

"We recommend that you increase the load at a rate of no more than 50 percent every five minutes."

in my experience it does scale faster than that, but not reliably. in other places i read about it scaling every five minutes. that is a lifetime when you turn on a popular ticket sale. my ec2s sitting behind it aren't even at 50% capacity and users are getting timeouts and 503s. frustrating!

Harish Ganesan said...

if you are planning to run a ticket campaign, you can request AWS team to warm up ELB with requests/sec. This way during the campaigns you do not end up losing valuable business.

Gary L said...

Thank you for the post. Very useful.

A question on the "Cost": in your example, with ELB, is the outbound data (going thru ELB) charged at all (excluding the $0.008/Gb)? In the HAProxy case, the outbound data going thru HAProxy would be charged at roughly $0.12/Gb. Not sure about the ELB case. Would you clarify? Thanks.

Need Consulting help ?


Email *

Message *

All posts, comments, views expressed in this blog are my own and does not represent the positions or views of my past, present or future employers. The intention of this blog is to share my experience and views. Content is subject to change without any notice. While I would do my best to quote the original author or copyright owners wherever I reference them, if you find any of the content / images violating copyright, please let me know and I will act upon it immediately. Lastly, I encourage you to share the content of this blog in general with other online communities for non-commercial and educational purposes.