Saturday, October 6, 2012

Part 3: Cost of Latency Series: Sample technical architecture using Route 53 LBR

Complexities and Best Practices behind Geo Distributed + R53 LBR

Let us take a simple Geo distributed online app stack and explore the technicalities and best practices a bit:

DNS and CDN Layer: Configure Route 53 to manage the DNS entries, mapping domain names to CloudFront distributions and to Latency Based Routing (LBR) entries. The LBR records point to the Amazon Elastic Load Balancer endpoints in Europe and Singapore. Amazon Route 53's LBR feature routes Amazon CloudFront origin requests to the AWS Region that provides the lowest latency. Internally, Amazon Route 53 is integrated with Amazon CloudFront to collect latency measurements from each CloudFront edge location, which results in optimal origin fetches and improves overall performance.
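As a rough illustration of the LBR entries described above, here is a minimal sketch that builds one latency record per region. The dictionary follows the general shape of a Route 53 change batch (name, type, set identifier, region, TTL, value); the domain name, ELB host names and the helper `build_lbr_change_batch` are assumptions made up for this example, and nothing here calls AWS.

```python
def build_lbr_change_batch(record_name, endpoints, ttl=60):
    """Build a Route 53 style change batch with one latency record per region.

    endpoints maps an AWS region name to the ELB DNS name in that region.
    """
    changes = []
    for region, elb_dns in sorted(endpoints.items()):
        changes.append({
            "Action": "CREATE",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                # SetIdentifier must be unique among records sharing a name.
                "SetIdentifier": "origin-" + region,
                # The Region field is what marks this as a latency record.
                "Region": region,
                "TTL": ttl,
                "ResourceRecords": [{"Value": elb_dns}],
            },
        })
    return {"Comment": "LBR records for the CloudFront origin",
            "Changes": changes}


# Hypothetical ELB endpoints for the Europe and Singapore stacks.
batch = build_lbr_change_batch(
    "origin.example.com.",
    {
        "eu-west-1": "app-eu.eu-west-1.elb.amazonaws.com",
        "ap-southeast-1": "app-sg.ap-southeast-1.elb.amazonaws.com",
    },
)
regions = [c["ResourceRecordSet"]["Region"] for c in batch["Changes"]]
```

Both records share the same name, so the resolver answers with whichever region has the lowest measured latency for the requester. CloudFront would then use that name as its custom origin.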

Load Balancing Layer: Amazon Elastic Load Balancing (ELB) is used as the load balancing layer. ELB can elastically expand its capacity to handle load during peak traffic. If sensitive information is passed, configure ELB so that SSL is terminated at the Apache backends, to meet security and compliance requirements. The Round Robin algorithm is ideal for most scenarios. ELB should be configured to load balance across multiple AZs inside an Amazon EC2 Region. For more details about architecting using ELB refer
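To make the Round Robin idea concrete, here is a toy sketch of how requests rotate across backends that sit in different AZs. It is not how ELB is implemented internally; the backend names and AZ labels are hypothetical.

```python
from itertools import cycle


def round_robin(backends):
    """Return an iterator that yields backends in a repeating order."""
    return cycle(backends)


# Hypothetical Apache backends spread over two AZs in one region.
backends = [
    ("web-1", "eu-west-1a"),
    ("web-2", "eu-west-1b"),
    ("web-3", "eu-west-1a"),
]
picker = round_robin(backends)

# Six consecutive requests visit every backend twice, in order.
first_six = [next(picker)[0] for _ in range(6)]
```

Each backend receives the same share of requests regardless of AZ, which is why an even instance count per AZ keeps the load balanced.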

Web/App Layer: Apache Tomcat EC2 instances are launched from S3-backed Linux AMIs in multiple AZs. Logs are periodically shipped to S3. Amazon Auto Scaling can be configured based on CPU or custom metrics to elastically increase or decrease the number of EC2 instances across multiple AZs (the recommended approach for scalability and HA). ELB, Amazon Auto Scaling, CloudWatch and Route 53 work together. Session state is synchronized on Memcached. For more details on Amazon EC2 Availability Zones refer
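The CPU-based scaling decision can be sketched as a simple threshold policy. The thresholds, step size and group bounds below are illustrative assumptions, not Auto Scaling defaults, and the functions only model the decision rather than calling AWS.

```python
def desired_capacity(current, avg_cpu, min_size=2, max_size=10,
                     scale_out_at=70.0, scale_in_at=30.0, step=2):
    """Return the new instance count for a simple CPU threshold policy."""
    if avg_cpu >= scale_out_at:
        target = current + step
    elif avg_cpu <= scale_in_at:
        target = current - step
    else:
        target = current
    # Clamp to the group's bounds so the fleet never drops below min_size.
    return max(min_size, min(max_size, target))


def spread_across_azs(count, azs):
    """Distribute count instances as evenly as possible across AZs."""
    base, extra = divmod(count, len(azs))
    return {az: base + (1 if i < extra else 0) for i, az in enumerate(azs)}


# Example: 4 instances at 85% average CPU scale out to 6, three per AZ.
new_count = desired_capacity(4, 85.0)
placement = spread_across_azs(new_count, ["eu-west-1a", "eu-west-1b"])
```

Keeping the minimum size at two or more means at least one instance can remain in each of two AZs even after scaling in.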

Solr Search Layer: Solr search instances are launched from EBS-backed AMIs. Solr EC2 instances can be replicated across multiple AZs or sharded inside an AZ, depending on need. High Memory instances with RAID (EBS striping) + EBS-optimized + Provisioned IOPS give better performance on AWS. Periodic snapshots are taken and moved across regions. Sync Solr and the DB periodically. For more details on Solr Sharding refer
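For the sharding option, one common way to decide which shard owns a document is to hash its ID. The sketch below shows that idea only; it is not Solr's own routing logic, and the function names and document IDs are hypothetical.

```python
import hashlib


def shard_for(doc_id, num_shards):
    """Map a document ID to a shard index using a stable hash."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def group_by_shard(doc_ids, num_shards):
    """Group document IDs by the shard that should index them."""
    shards = {i: [] for i in range(num_shards)}
    for doc_id in doc_ids:
        shards[shard_for(doc_id, num_shards)].append(doc_id)
    return shards


# Hypothetical product documents split over 4 Solr shards.
doc_ids = ["product-%d" % n for n in range(100)]
layout = group_by_shard(doc_ids, 4)
```

Because the hash is stable, the indexer and any re-index job agree on where a document lives. Changing the shard count moves most documents, so the number of shards is best chosen up front.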

Database Layer:
If the use case demands that data be localized inside an Amazon EC2 region, then one of the following approaches is recommended:

  • RDS with Multi-AZ for HA, with HAProxy load-balanced RDS Read Replicas across multiple AZs for read scaling
  • MySQL Master with 1-2 Slaves spread across multiple AZs inside a region, using RAID 0 with XFS + EBS-optimized + PIOPS for performance
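The read scaling idea in both options comes down to sending writes to the master and spreading reads over the replicas. The sketch below models that split in application code under simplifying assumptions: the host names are hypothetical, and treating every statement that starts with SELECT as a read is a deliberate simplification.

```python
from itertools import cycle


class ReadWriteRouter:
    """Send writes to the master and rotate reads across replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        """Return the host that should run this statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        # Writes, and reads when no replica exists, go to the master.
        return self.master


# Hypothetical endpoints: one master and two replicas in different AZs.
router = ReadWriteRouter("db-master", ["db-replica-a", "db-replica-b"])
targets = [
    router.endpoint_for("SELECT * FROM orders"),
    router.endpoint_for("select * from users"),
    router.endpoint_for("INSERT INTO orders VALUES (1)"),
    router.endpoint_for("SELECT 1"),
]
```

Replicas lag the master slightly, so reads that must see a write made moments earlier are usually safer on the master.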
If the use case demands unidirectional data synchronization across Amazon EC2 regions, then:

  • A MySQL Master can sync data to a MySQL Slave in another Amazon EC2 region. Data can be sent over SSL or in the clear, according to the requirements.
  • If MySQL is inside a VPC (private subnet), then an IPsec tunnel should be established between the two Amazon EC2 regions for communication.
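To show what the SSL option looks like on the slave side, here is a small sketch that assembles a MySQL `CHANGE MASTER TO` statement with `MASTER_SSL` enabled. The host, user, password and CA path are placeholders, and the helper `change_master_sql` is hypothetical; in practice you would run the resulting statement on the slave and then start replication.

```python
def change_master_sql(host, user, password, use_ssl=True,
                      ssl_ca=None, port=3306):
    """Build a CHANGE MASTER TO statement for a cross-region slave."""

    def quote(value):
        # Escape backslashes and single quotes for a SQL string literal.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    options = [
        "MASTER_HOST=" + quote(host),
        "MASTER_PORT=%d" % port,
        "MASTER_USER=" + quote(user),
        "MASTER_PASSWORD=" + quote(password),
        "MASTER_SSL=%d" % (1 if use_ssl else 0),
    ]
    if use_ssl and ssl_ca:
        options.append("MASTER_SSL_CA=" + quote(ssl_ca))
    return "CHANGE MASTER TO " + ", ".join(options) + ";"


# Placeholder values; a real setup would use its own host and credentials.
stmt = change_master_sql("10.0.1.5", "repl", "secret",
                         ssl_ca="/etc/mysql/ca.pem")
```

With an IPsec tunnel between the two VPCs the traffic is already encrypted in transit, so `use_ssl=False` may be acceptable there, depending on your compliance requirements.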
If the use case demands bidirectional data synchronization across Amazon EC2 regions, then:

  • MySQL Master-Master across regions and Master-Slave inside regions can be configured. Though bidirectional data sync can be achieved, maintaining transactional integrity becomes complex. Overall, this model is not very efficient as the number of AWS regions increases.
  • Usually the best practice is to streamline and avoid bidirectional sync, and instead expose the function as a common data web service that can be consumed over the web. This way the geo-distributed applications in both Amazon EC2 regions can consume that service for information.
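A common data web service of this kind can be very small. The sketch below is a minimal WSGI application that serves shared records as JSON, plus a helper that calls it in-process. The `/items/<id>` route, the sample record and the helper names are assumptions for illustration; a real service would sit in front of the single authoritative database.

```python
import json

# Hypothetical shared data owned by one region's database.
_CATALOG = {"sku-1": {"name": "example item", "price": 10}}


def data_service(environ, start_response):
    """WSGI app: GET /items/<id> returns the item as JSON."""
    path = environ.get("PATH_INFO", "")
    item_id = path.rsplit("/", 1)[-1]
    item = _CATALOG.get(item_id) if path.startswith("/items/") else None
    if item is None:
        status = "404 Not Found"
        body = json.dumps({"error": "not found"}).encode("utf-8")
    else:
        status = "200 OK"
        body = json.dumps(item).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]


def call(app, path):
    """Invoke a WSGI app in-process and return (status, parsed JSON)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body.decode("utf-8"))


status, item = call(data_service, "/items/sku-1")
```

Applications in both regions would call the same endpoint over HTTPS, so there is only one writer for that data and no cross-region conflict resolution to design.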
Note: Geographically distributed databases are a hot field, and a lot is emerging every day, such as Google Spanner, Yahoo PNUTS, NuoDB, TransLattice Elastic Database, Cloudant, ClearDB etc. In the coming days you will be using these systems, which will make your life easier when architecting geo-distributed applications. I also hope the AWS product team comes up with a solution for this geo-distributed database problem.

Caching Layer: Use Memcached/ElastiCache for storing sessions and the results of heavy, frequently used and complex DB/Solr queries, thereby significantly reducing the database load. ElastiCache cannot be distributed over AZs. Memcached on Amazon EC2 with Multi-AZ distribution is recommended for websites that rely heavily on the caching layer. The cache need not be replicated across regions; if it is needed, you must sync the Master/Slave DB replication with Memcached to ensure some consistency.
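The way a cache takes load off the database is usually the cache-aside pattern: look in the cache first, and only run the query on a miss. The sketch below uses an in-memory dictionary with a TTL as a stand-in for Memcached; the class and function names are hypothetical, and a real deployment would use a Memcached client instead.

```python
import time


class TTLCache:
    """In-memory stand-in for Memcached with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped and treated as a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)


def cached_query(cache, key, run_query):
    """Cache-aside: return the cached value, or run the query and store it."""
    value = cache.get(key)
    if value is None:
        value = run_query()
        cache.set(key, value)
    return value
```

A short TTL bounds how stale a cached result can get after the database changes, which is the "some consistency" trade-off mentioned above. Note that this sketch never caches a `None` result, so queries that return nothing always reach the database.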

Storage Layer: S3 is used for storing images, JS and other static assets. S3 can be the CloudFront origin. All logs and user-uploaded files will be synced to S3 in the Amazon EC2 region.

View Full Detailed article at

