Applications can often be made to perform better and run faster by caching critical pieces of data in memory. Frequently accessed data, rendered HTML fragments, results of time-consuming or expensive database queries, search results, sessions, and results of complex calculations and processes are usually very good candidates for cache storage. Not every application architecture benefits equally from a caching layer, though: read-intensive applications usually see significant performance gains from a cache, whereas write-intensive applications may not get much benefit.
There are various ways in which a caching layer can be designed on AWS infrastructure. The most popular model is distributed caching using Memcached, a high-performance, distributed in-memory key-value object caching system.
Anatomy of a Memcached system
- A memcached client, which is given a list of the available memcached servers in the farm.
- A client-side hashing algorithm, which chooses a server for each GET/SET based on the "key" input.
- Memcached server instances, which store your values with their keys in an internal hash table.
Memcached uses memory and network heavily, and CPU to a lesser extent. It supports both TCP and UDP, with text and binary protocols, for communication. Client libraries are available for popular programming languages such as Java, .NET, PHP, Python, and Ruby.
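As a minimal sketch of this anatomy, assuming the spymemcached Java client and two hypothetical server endpoints, the snippet below shows the client-side hashing in action: the client, not the servers, decides which node stores each key.

```java
import java.util.concurrent.TimeUnit;

import net.spy.memcached.AddrUtil;
import net.spy.memcached.MemcachedClient;

public class MemcachedBasics {
    public static void main(String[] args) throws Exception {
        // Hypothetical two-node farm; the client hashes each key to one node.
        MemcachedClient client = new MemcachedClient(
                AddrUtil.getAddresses("cache-a.example.com:11211 cache-b.example.com:11211"));

        // SET with a 300-second expiry; spymemcached operations return async Futures.
        client.set("user:42:profile", 300, "{\"name\":\"alice\"}").get(5, TimeUnit.SECONDS);

        // GET is routed to whichever node the key hashed to.
        Object value = client.get("user:42:profile");
        System.out.println(value);

        client.shutdown();
    }
}
```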
In this article, let us explore and analyze some popular memcached/ElastiCache deployment architectures on AWS.
Architecture 1: Apache + Memcached shared on the same EC2 (Distributed Cache)
Memcached shares the same Amazon EC2 instance with Apache. Imagine an m1.large EC2 instance where 7 GB of RAM and 2 CPU cores are shared between the OS, Apache, and memcached. Since Amazon ElastiCache runs on a separate tier, it does not fit this shared approach. Apache-A can contact memcached-A, memcached-B, or any memcached-X node depending on the key/value hash. Since otherwise unused memory is given to memcached and no dedicated EC2 instances are launched for the caching tier, this model is usually cost effective. We have seen some production implementations using this approach, but in my opinion it is not suitable for applications that demand heavy scaling and clear service separation. This model can be scaled up but not scaled out optimally; that is, based on your traffic demand you can scale the shared EC2 instance up to bigger capacities (XLarge, Quadruple, High I/O, M3 class, etc.) but you cannot easily add new instances of this type. Some negatives that this approach brings to the table are:
Maintenance: Since Apache and memcached share the same EC2 instance, it is strongly advised not to Auto Scale this layer. Only manual scaling is possible, which can place a heavy configuration burden on the IT team during traffic peaks and valleys.
Auto Scaling: Using Amazon Auto Scaling we can add new web/app EC2 instances dynamically depending on traffic demand. But when the web/app server instance also contains a running memcached, it brings cascading complexity into the architecture. Imagine two m1.large Apache + memcached EC2 instances are running and a third is launched by Amazon Auto Scaling based on traffic. The load balancer now sends a third of the web traffic to this new Apache EC2 instance, and since its cache is empty, a third of the requests hit the backend database heavily. If instead of one you scale out by two Apache EC2 instances during a peak, the unwarmed memcached instances increase the database load by 50%. Secondly, the new memcached endpoint has to be propagated and configured on the other memcached clients, which adds further complexity and DevOps engineering to the architecture. Finally, Amazon Auto Scaling will pull out an Apache EC2 instance when load decreases; if you pull out an Apache + memcached instance that is properly warmed, the resulting cache misses again increase the database load.
Note: We can still try to address this problem by adding more design/engineering complexity with progressive weighted EC2 balancing during scale-out, internal cache-warming techniques, and so on, but if you honestly ask whether it is worth it, the answer is usually no. Alternatively, we can avoid this complexity altogether by simplifying the overall architecture of the system, as we will see as the article progresses.
Sharing: We observed earlier that sharing Apache + memcached on the same EC2 instance saves cost. On the other hand, this sharing also causes problems if one is not aware of the environment. In our case Apache and memcached are shared, and Apache-A can talk to the memcached on the same EC2 instance or on another Apache EC2 instance, depending on the key/value hash. Based on this flow, let us explore some problems with the sharing approach.
- Apache is usually heavy on memory and CPU. Memcached is low on CPU but high on memory and network, depending on the average size of your items.
- If memcached is not configured with a memory limit, it can crash your Apache and OS. If the website is heavily loaded and built to be cache dependent, there will be heavy CPU contention between Apache and memcached.
- If the Apache and memcached requests/responses are large, there will be greater contention on the shared network layer. Overall request throughput can drop because of heavy buffering and network contention.
- The Apache EC2 instance now has the bigger headache of handling all the TCP sockets flowing between the Internet, the database, the internal network, and memcached. Some teams address this last point marginally by using UDP for memcached communication to reduce temporary TCP socket exhaustion; overall this is a patch, not a proper solution.
Architecture 2: Apache + Memcached shared on the same EC2 (Local Cache)
This approach differs slightly from the previous one. Apache/Nginx and memcached are still shared on the same EC2 instance, but the web (Apache or Nginx) process strictly calls only the local memcached and never a remote memcached. Basically, memcached is used here as a local instance cache, not as a distributed cache. Every Apache/Nginx caches items in its memcached and uses it as extended memory. Since the items come from the same EC2 instance, throughput and latency are better for cached entries. Though this approach has less configuration headache than the previous one, it still inherits many of its problems. A sticky-session algorithm is preferred on the load-balancing tier to optimally reuse cached items and reduce database load, because a round-robin algorithm can heavily exercise the database during the initial cache-warming phase. Rapid scaling out and scaling down should be avoided in smaller deployments because it transfers load onto the database immediately. If a large fleet (hundreds) of Nginx + memcached instances is already running, then rapidly adding a few (5-10) EC2 instances of this kind will not cause huge problems for the database. Proper architectural guidance is recommended before fitting this architecture to your use case.
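The distinguishing configuration of this local-cache variant is that the client is given only the loopback endpoint, so no remote memcached is ever contacted. A minimal sketch, assuming memcached runs on its default port on the same instance:

```java
import net.spy.memcached.AddrUtil;
import net.spy.memcached.MemcachedClient;

public class LocalCacheClient {
    // Only the local memcached endpoint is configured, so every GET/SET
    // stays on this EC2 instance (local instance cache, not distributed).
    private static final String LOCAL_ENDPOINT = "127.0.0.1:11211";

    public static MemcachedClient create() throws java.io.IOException {
        return new MemcachedClient(AddrUtil.getAddresses(LOCAL_ENDPOINT));
    }
}
```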
As we observed in detail, the above approaches might be cost effective for smaller deployments, but as the site gets popular, traffic increases, and scalability is demanded, they become complex to handle. In architecture, complexity arising from improper design is usually followed by heavy maintenance and management costs.
Now that we have understood the impact of sharing memcached with the web/app server, a simple solution is to split memcached onto separate EC2 instances. The recently introduced M3 class instance types are good candidates for a separate memcached tier. But the question is whether we really need to manage and maintain an additional memcached layer ourselves. The answer is no: use Amazon ElastiCache.
Amazon ElastiCache is a web service that is protocol-compliant with Memcached, a widely adopted memory object caching system, so code, applications, and popular tools that you use today with existing Memcached environments will work seamlessly with the service.
Architecture 3: Apache + Amazon ElastiCache on a separate tier
Apache and caching run on clearly separated tiers in this approach. Since the tiers are separated, the Apache EC2 instances can be easily scaled out using Amazon Auto Scaling or custom scaling. Dynamically launching or terminating Apache instances will not swamp the database, because the warmed cache is separate and still accessible by all the Apache EC2 instances. It is also easy to roll out configuration changes, add new nodes to the caching layer, and propagate the changes to the cache clients. The clear separation also lets us isolate and address issues creeping up in the Apache and caching layers individually.
ElastiCache nodes are grouped inside an ElastiCache cluster. An ElastiCache cluster is a collection of one or more cache nodes, each running an instance of the memcached service. The word "cluster" in this context means "grouping", not "data synchronization", because ElastiCache nodes do not talk to each other or exchange information inside the cluster. Most operations, such as configuration, security, and parameter changes, are performed at the cache cluster level rather than at the individual cache node level. This enables easy maintenance and management of the caching tier as a whole. Since ElastiCache is protocol compliant with memcached, programs written in Java, PHP, and Python on Apache can still use their respective memcached clients and perform SET/GET operations seamlessly. The ElastiCache node endpoints (like "ecache1a.sqjbuo.0001.use1.cache.amazonaws.com:11211") need to be configured on the memcached clients of the Apache EC2 instances. In the endpoint above, "ecache1a" is the cluster name, "0001" is the node number, and 11211 is the port. Whenever a new node is added to the "ecache1a" cluster, the next number in sequence ("0002", "0003", and so on) is assigned in its endpoint URL. This predictable pattern helps us automate the detection of cache node endpoints on the client side in scalable environments.
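A short sketch of that automation, reusing the example cluster prefix above (the "sqjbuo" and region fragments are specific to each account and appear here only for illustration): given the cluster name and node count, the client's server list can be generated rather than hand-maintained.

```java
import java.io.IOException;

import net.spy.memcached.AddrUtil;
import net.spy.memcached.MemcachedClient;

public class ElastiCacheEndpoints {
    // Endpoint template based on the example in this article; adjust the
    // cluster-specific ("sqjbuo") and region ("use1") parts for your account.
    private static final String TEMPLATE = "%s.sqjbuo.%04d.use1.cache.amazonaws.com:11211";

    public static MemcachedClient connect(String clusterName, int nodeCount) throws IOException {
        StringBuilder endpoints = new StringBuilder();
        for (int node = 1; node <= nodeCount; node++) {
            if (node > 1) {
                endpoints.append(' ');
            }
            endpoints.append(String.format(TEMPLATE, clusterName, node));
        }
        // e.g. connect("ecache1a", 2) builds the 0001 and 0002 node endpoints.
        return new MemcachedClient(AddrUtil.getAddresses(endpoints.toString()));
    }
}
```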
Since a single ElastiCache cluster can currently span only a single Amazon Availability Zone, it is advised to keep both the Apache EC2 instances and the ElastiCache instances in the same Availability Zone for improved latency. Inside a single AZ, a single SET/GET operation between Apache and ElastiCache takes around 1-5 milliseconds using AWS High-Memory Quadruple instance types. This latency also depends on parameters like the Apache EC2 instance type, the ElastiCache instance type, the size of the SET/GET requests, and single versus bulk operations. Imagine you use m1.large for the Apache and ElastiCache instances and every SET/GET is around 1 MB in size. If the available network bandwidth between the Apache EC2 instance and ElastiCache is only 15 MB at that instant, only about 15 requests can be performed concurrently at that instant. You may find the CPU under-utilized and the max connections well set in ElastiCache, yet throughput is low for exactly this reason. This is not a problem with ElastiCache performance, but rather a poor understanding of how the architecture components behave.

If the web app is cache dependent, it is advised to spread the items across multiple cache nodes. Imagine you have a requirement for close to 20 GB of cache. You can distribute it either across 2 m1.xlarge ElastiCache nodes or across 4 m1.large ElastiCache nodes. The cache data is distributed by the memcached client to multiple nodes based on the key/value hash. If one cache node goes down, 50% of the cache load will hit the backend data stores in the m1.xlarge approach, whereas only 25% of the cache load will hit the data stores in the m1.large approach. Also, since it is currently not possible to mix cache node instance types inside a single ElastiCache cluster, I advise you to do proper capacity planning, taking into consideration the cache dependency and the capacity of the backend database to absorb direct requests, before deciding the cache node count, size, and consolidation levels.
As the name suggests, Amazon ElastiCache lets you automatically or manually add or remove cache nodes from an existing ElastiCache cluster, making the whole tier elastic and flexible for customers. This is one of the most important features of Amazon ElastiCache, and it eventually falls in line with any growing website's roadmap. Now let us try to understand the remapping implications of adding or removing cache nodes from the cache cluster.
With a normal hashing algorithm, changing the number of servers can cause many keys to be remapped to different servers, resulting in huge sets of cache misses. Imagine you have 10 ElastiCache nodes in your cache cluster; adding an eleventh server may cause 40%+ of your keys to suddenly map to different servers than before. This is undesirable: it causes cache misses and can swamp your backend database with requests. To minimize this remapping, it is recommended to use a consistent hashing model in your cache clients. Consistent hashing is a model that allows a more stable distribution of keys when servers are added or removed. It describes methods for mapping keys to a list of servers such that adding or removing servers causes only a minimal shift in where keys map. Using this approach, adding an eleventh server should cause less than 10% of your keys to be reassigned. This percentage may vary in production, but it is far more efficient in such elastic scenarios than normal hashing algorithms. It is also advised to keep the memcached server ordering and the number of servers the same in all client configurations when using consistent hashing. Java applications can use the Ketama consistent-hashing algorithm through spymemcached to integrate this behavior into their systems. More information on consistent hashing can be found at http://www.last.fm/user/RJ/journal/2007/04/10/rz_libketama_-_a_consistent_hashing_algo_for_memcache_clients
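A minimal sketch of enabling this in spymemcached, assuming two ElastiCache node endpoints from the example cluster above; the CONSISTENT locator combined with the KETAMA_HASH algorithm gives libketama-compatible behavior.

```java
import java.io.IOException;

import net.spy.memcached.AddrUtil;
import net.spy.memcached.ConnectionFactoryBuilder;
import net.spy.memcached.DefaultHashAlgorithm;
import net.spy.memcached.MemcachedClient;

public class KetamaClientFactory {
    public static MemcachedClient create() throws IOException {
        // Consistent (Ketama) hashing: adding or removing a node remaps
        // only a small fraction of keys instead of most of them.
        return new MemcachedClient(
                new ConnectionFactoryBuilder()
                        .setLocatorType(ConnectionFactoryBuilder.Locator.CONSISTENT)
                        .setHashAlg(DefaultHashAlgorithm.KETAMA_HASH)
                        .build(),
                AddrUtil.getAddresses(
                        "ecache1a.sqjbuo.0001.use1.cache.amazonaws.com:11211 "
                        + "ecache1a.sqjbuo.0002.use1.cache.amazonaws.com:11211"));
    }
}
```

As noted above, keep this server list, including its order, identical across every client.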
For a deep dive into Amazon ElastiCache internals, such as connection overheads, memory allocation, and elasticity implications, see this article: http://harish11g.blogspot.in/2013/01/amazon-elasticache-memcached-internals_8.html
Architecture 4: Apache + Amazon ElastiCache
in Multiple Availability Zones
This is an extension of the previous approach: for better availability, the cache nodes are distributed among multiple Availability Zones of an Amazon EC2 region. Most of the points discussed in the previous approach apply to this architecture as well. Since an ElastiCache cluster currently cannot span multiple AZs, you create multiple ElastiCache clusters in multiple AZs. For example, you can create an ElastiCache cluster "ecache1a" in AZ us-east-1a with a node at endpoint "ecache1a.sqjbuo.0001.use1.cache.amazonaws.com:11211", and another cluster "ecache1b" in AZ us-east-1b with a node at endpoint "ecache1b.sqjbuo.0001.use1.cache.amazonaws.com:11211". Both cache node endpoints should be configured in the memcached clients. Since the AZ concept is built transparently by AWS, the memcached clients on the Apache EC2 instances can distribute data seamlessly across the cache nodes in both AZs. You can manage the cache clusters separately while still distributing data across AZs in this approach. If an entire AZ is affected, the cache nodes in the alternate AZ remain accessible and functional. Instead of the database getting swamped by 100% cache misses, you reduce that to roughly 50% with this AZ distribution. This percentage can be reduced much further if the data is distributed among more AZs with more cache nodes inside them.
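A minimal sketch of the client configuration for this approach, assuming one node in each of the two clusters named above; both endpoints go into a single client's server list, and the key hash spreads items across the AZs.

```java
import java.io.IOException;

import net.spy.memcached.AddrUtil;
import net.spy.memcached.MemcachedClient;

public class MultiAzCacheClient {
    public static MemcachedClient create() throws IOException {
        // One node from cluster ecache1a (AZ us-east-1a) and one from
        // ecache1b (AZ us-east-1b); keys are hashed across both AZs.
        return new MemcachedClient(AddrUtil.getAddresses(
                "ecache1a.sqjbuo.0001.use1.cache.amazonaws.com:11211 "
                + "ecache1b.sqjbuo.0001.use1.cache.amazonaws.com:11211"));
    }
}
```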
The ElastiCache maintenance window lets you specify the time range (UTC) during which scheduled maintenance activities, such as software patching or pending cache cluster modifications you requested, will occur. Scheduled maintenance activities occur infrequently (generally once every few months) and are announced on the AWS forum two weeks before being scheduled. After a maintenance window, your cache nodes may lose all the data stored in memory and need to be warmed again. Imagine a single ElastiCache cluster with 10 cache nodes, all of them needing a cache-warming phase after the maintenance period: this puts a heavy burden on your database and other backend data stores during the refresh phase, and on heavily cache-dependent architectures it can even bring the system to its knees. Since AWS is very elastic and flexible, you can either plan to increase your backend capacity on demand for a few hours to a few days until the cache layer is adequately warmed, or leverage the multi-AZ ElastiCache approach. Imagine you have 4 ElastiCache clusters distributed in 4 Availability Zones inside an Amazon EC2 region. You can configure maintenance windows spanning multiple days for the different cache clusters; for example, ecache1a can have maintenance on Monday, ecache1b on Tuesday, and so forth. This distribution of ElastiCache maintenance windows gives you enough time to warm cache nodes in phases and helps you avoid all caches swamping your backend with requests simultaneously.
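A hedged sketch of staggering the windows with the AWS SDK for Java, reusing the two cluster IDs from this article; the ModifyCacheCluster API accepts a weekly window in ddd:hh24:mi-ddd:hh24:mi format, and the specific slots below are illustrative assumptions.

```java
import com.amazonaws.services.elasticache.AmazonElastiCache;
import com.amazonaws.services.elasticache.AmazonElastiCacheClientBuilder;
import com.amazonaws.services.elasticache.model.ModifyCacheClusterRequest;

public class StaggerMaintenanceWindows {
    public static void main(String[] args) {
        AmazonElastiCache elastiCache = AmazonElastiCacheClientBuilder.defaultClient();

        // Give each cluster its own weekly slot so only one AZ's cache
        // is cold at any given time.
        elastiCache.modifyCacheCluster(new ModifyCacheClusterRequest()
                .withCacheClusterId("ecache1a")
                .withPreferredMaintenanceWindow("mon:05:00-mon:06:00"));

        elastiCache.modifyCacheCluster(new ModifyCacheClusterRequest()
                .withCacheClusterId("ecache1b")
                .withPreferredMaintenanceWindow("tue:05:00-tue:06:00"));
    }
}
```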
This architectural approach is not suitable for smaller deployments running in a single AZ. I suggest it only for larger deployments where the Apache EC2 instances are auto scaled and the Apache and ElastiCache clusters are well distributed across multiple AZs, so that overall cache item SET/GET latencies stay at acceptable levels.
Launch Amazon ElastiCache in 3 Easy Steps: http://harish11g.blogspot.in/2012/11/configuring-amazon-elasticache-launch.html
Architecture 5: Apache + Amazon ElastiCache
+ Redundancy
This is a slightly different approach, built for availability and redundancy. Apache and ElastiCache are deployed on separate tiers, and the Apache EC2 instances can be individually auto scaled across multiple AZs. Multiple ElastiCache clusters are created across multiple Availability Zones inside the Amazon EC2 region (so far very similar to the previous approach). In this approach, certain items are redundantly cached on two cache nodes in ElastiCache clusters in different AZs for better availability. Results of time-consuming and expensive database queries, results of complex calculations, and the like are good candidates. Imagine an expensive query that pounds the database for ~250 milliseconds or more: if the data does not change frequently, it can be redundantly stored on 2 ElastiCache nodes. If cache node 1 is down, throws a connection error, or misses the item, the redundant cache node 2 can be asked for the same item. If the item is not present in cache node 2 either, then as a last resort the database is queried and the latest result is stored in both cache nodes redundantly. If it takes around 2-5 ms for a single ElastiCache node to return a value, hitting 2 cache nodes redundantly still returns results in ~10 ms, which is far better than pounding the database and waiting for the result. This approach is not suitable for frequently changing data, because it may return stale data from the cache; for such scenarios a plain ElastiCache -> DB fallback is better. Also, it is not necessary to build redundancy for all cache nodes and cache clusters; you should build redundancy only for specific cache nodes in the system. This capability is not pre-built into the memcached APIs currently; it has to be implemented manually in the application code, as sketched below, by making multiple calls to multiple sets of cache nodes. Though it reduces the overall GET time for complex requests, your SET times will increase marginally because of the multiple requests made to the cache nodes.
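A minimal sketch of that manual implementation, assuming two separately configured spymemcached clients (one per AZ cluster) and a hypothetical loadFromDatabase helper standing in for your data access layer:

```java
import net.spy.memcached.MemcachedClient;

public class RedundantCache {
    private final MemcachedClient primary;   // nodes in cluster ecache1a
    private final MemcachedClient secondary; // nodes in cluster ecache1b
    private static final int TTL_SECONDS = 300;

    public RedundantCache(MemcachedClient primary, MemcachedClient secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    public Object getExpensiveResult(String key) {
        Object value = safeGet(primary, key);      // try cache node set 1
        if (value == null) {
            value = safeGet(secondary, key);       // fall back to node set 2
        }
        if (value == null) {
            value = loadFromDatabase(key);         // last resort: hit the DB
            // Repopulate both sides; SET cost rises, GET availability improves.
            primary.set(key, TTL_SECONDS, value);
            secondary.set(key, TTL_SECONDS, value);
        }
        return value;
    }

    private Object safeGet(MemcachedClient client, String key) {
        try {
            return client.get(key);
        } catch (Exception e) {
            return null; // treat node/connection errors as a cache miss
        }
    }

    private Object loadFromDatabase(String key) {
        // Hypothetical placeholder for the expensive query described above.
        throw new UnsupportedOperationException("wire this to your data access layer");
    }
}
```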
This approach is costlier and more complex than the other architectures mentioned above, but for some use cases it can save heavily on database hardware capacity and provide immense infrastructure cost savings overall. It is suggested to carefully analyze the fit of this approach based on your use case, cost, and maintenance needs.
Related Articles
Part 1: Understanding Amazon ElastiCache Internals : Connection overhead
Part 2: Understanding Amazon ElastiCache Internals : Elasticity Implication and Solutions
Part 3: Understanding Amazon ElastiCache Internals : Auto Discovery
Part 4: Understanding Amazon ElastiCache Internals : Economics of Choosing Cache Node Type
Launching Amazon ElastiCache in 3 Easy Steps
Caching architectures using Memcached & Amazon ElastiCache
Web Session Synchronization patterns in AWS
9 comments:
Which is the best Java memcached client? I was using the Danga memcached client but I think it's deprecated. Which one is most used now? I'm a sysadmin, not a developer, so no idea about it. I have tried setting up ElastiCache + the Danga client for caching DB queries, but it takes more time than usual to get results. Any suggestions?
Randeep, try spymemcached or xmemcached for Java.
Hello Harish,
Great article.
You talked about auto-scaling when we have memcached running on the same server as Apache. How does that work? My understanding is that your app should be aware of each running memcached instance for it to be a truly distributed cache. So unless you bring down all the instances, change the configuration in each to add the new server, and start them up again, it will not work. Is that true auto-scaling? Am I missing something here?
Hi,
Thanks for your time reading this article. I assume you are talking about architecture 1 in reference to Auto Scaling. Yes, you are right that auto scaling a web/app + memcached combination is not an easy proposition. I have mentioned this difficulty in the article: "The new memcached endpoint has to be propagated and configured on the other memcached clients, which adds further complexity and DevOps engineering to the architecture". What I mean by DevOps engineering is that you should engineer a centralized discovery service which is updated/queried by the web/app + memcached instances for new endpoints. Netflix Asgard follows that line, but I have not fully evaluated it. Also, as you rightly pointed out, it will require a restart of your web/app process to recognize the new memcached configuration. Thanks for your question; I will take this as an opportunity to elaborate on this point in the article.
Architecture 4 question: if I am using AmazonElastiCacheClusterClient-1.0.1, which allows me to read/write to memcached using the configuration endpoint, would I need to always write to both clusters if I have 2 Availability Zones in my deployment? Do I randomly choose one to read from and, if it fails, read from the other?
Thanks for your article.
Nice post!
Is there a CohQL equivalent in ElastiCache? Is there a way to query the cache?
Thanks
Hey Harish,
This blog is a huge help for a beginner like me. I just have a simple query, naive though it may sound. Can we use a single node or a cluster across multiple EC2 instances? Or is ElastiCache equivalent to (and in some cases better than) EC2 + memcached? Hope you can clarify this for me. Thanks in anticipation.
Cheers