Friday, April 5, 2013

Cost Saving Tips : Part 1 : Amazon SQS Long Polling and Batch requests


Amazon Simple Queue Service (SQS) is a highly scalable, distributed queuing service provided by AWS. It is one of the oldest AWS services. It can be used inside your architecture for distributing messages between systems, services and components. SQS provides APIs for you to send and receive messages. SQS is widely used and is popular with a variety of applications in the media, ecommerce, learning and travel industries. SQS operates on a “Pay for what you use” model, so your cost grows with your usage.
Imagine you have a popular online application (with a user base in the millions) implemented on AWS that uses SQS. The application tiers are loosely coupled and they transfer ~10 million messages a day between their internal systems/services. The message size varies between 1 and 4KB. The message producers/consumers are multi-threaded and distributed across multiple EC2 instances for parallel processing and scalability. Now let us explore the costs/scenarios involved in this case:
Point 1: Since the EC2 instances and SQS are inside the same AWS region, the data transferred between Amazon SQS and Amazon EC2 is free of charge.
Point 2: Currently (as of Apr-2013) AWS charges $0.50 per 1 million Amazon SQS requests. Processing a message takes the essential calls SendMessage, ReceiveMessage, ChangeMessageVisibility and DeleteMessage, so you end up making a minimum of ~4 (non-empty) requests per message. At the current price this equates to:
10 million messages X 4 non-empty requests = 40 million non-empty requests per day
40 million requests X $0.50 per million = $20 per day
For a month, this equates to $600 in payments to AWS for SQS, plus the wastage cost (explained below).
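The arithmetic in Point 2 can be reproduced with a few lines of Python. This is only a sketch: the function name `sqs_request_cost` is made up for illustration, the price is the Apr-2013 figure quoted above, and a 30-day month is assumed.

```python
# Figures from the example scenario: 10 million messages a day,
# 4 requests per message, $0.50 per million requests (Apr-2013 price).
PRICE_PER_MILLION_REQUESTS = 0.50  # USD


def sqs_request_cost(requests: int,
                     price_per_million: float = PRICE_PER_MILLION_REQUESTS) -> float:
    """Return the cost in USD of the given number of SQS requests."""
    return requests / 1_000_000 * price_per_million


messages_per_day = 10_000_000
# SendMessage, ReceiveMessage, ChangeMessageVisibility, DeleteMessage
requests_per_message = 4

daily_requests = messages_per_day * requests_per_message
daily_cost = sqs_request_cost(daily_requests)
monthly_cost = daily_cost * 30  # assumption: 30-day month

print(f"{daily_requests:,} requests/day -> "
      f"${daily_cost:.2f}/day, ${monthly_cost:.2f}/month")
# prints: 40,000,000 requests/day -> $20.00/day, $600.00/month
```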
Point 3: The above point assumes that messages flow through your system as a continuous, even stream and that your distributed, multi-threaded consumers receive a message on every ReceiveMessage request they make. In reality the message flow will not be evenly distributed, and there will be times when a ReceiveMessage request comes back empty. Let us assume that empty ReceiveMessage requests outnumber the valid ones ~4 to 1, i.e. on average only one in every 5 attempts returns a message. This leaves us with ~40 million empty receives a day, equating to $20 per day (~$600 leakage a month).
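The leakage estimate in Point 3 follows from the hit ratio. The sketch below assumes, as in the text, one non-empty receive in every 5 attempts; the helper name `empty_receives` is hypothetical and a 30-day month is assumed.

```python
PRICE_PER_MILLION_REQUESTS = 0.50  # USD, Apr-2013 price quoted in the post


def empty_receives(valid_receives: int, attempts_per_hit: int) -> int:
    """Number of empty ReceiveMessage calls when only one in every
    `attempts_per_hit` attempts returns a message."""
    return valid_receives * (attempts_per_hit - 1)


valid_receives_per_day = 10_000_000  # one successful receive per message
wasted_requests = empty_receives(valid_receives_per_day, attempts_per_hit=5)
wasted_per_day = wasted_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
wasted_per_month = wasted_per_day * 30  # assumption: 30-day month

print(f"{wasted_requests:,} empty receives/day -> "
      f"${wasted_per_day:.2f}/day, ${wasted_per_month:.2f}/month")
# prints: 40,000,000 empty receives/day -> $20.00/day, $600.00/month
```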
Point 4: Note that the ~10 million messages, each 1-4KB in size, are sent every day using SendMessage requests. On its own this costs around
10 million SendMessage requests X $0.50 per million = $5 per day
For a month, this equates to $150 in payments to AWS for SQS services (this is the SendMessage share of the $600 in Point 2).
Now let us explore some best practices for cost saving while using SQS:
Tip 1: We saw that polling for messages with ReceiveMessage requests can return empty receives, which leads to inefficient use of resources and wasted cost. We can lower the cost of SQS by using “long polling” in our design. Long polling is available as part of the AWS SQS SDK; it reduces the extra polling requests and helps you handle fluctuations in message rate cost efficiently. With long polling in your design, SQS waits for a message to become available and sends it to the client if the message arrives within a customer-defined time period. Since the extra, empty receives are cut down considerably by this technique, we can save around ~$600 a month (the empty receive leakage cost) in the sample scenario.
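As a rough illustration of Tip 1, here is a minimal long-polling consumer sketch. It assumes a boto3-style SQS client is passed in as `sqs`; the function name `receive_with_long_polling` and the `handle` callback are hypothetical, not part of any SDK. `WaitTimeSeconds` is the ReceiveMessage parameter that enables long polling, and SQS caps it at 20 seconds.

```python
from typing import Any, Callable

MAX_WAIT_SECONDS = 20  # SQS caps WaitTimeSeconds at 20 seconds


def receive_with_long_polling(sqs: Any,
                              queue_url: str,
                              handle: Callable[[str], None],
                              wait_seconds: int = MAX_WAIT_SECONDS) -> int:
    """Issue one long-polling ReceiveMessage call and process what arrives.

    `sqs` is assumed to be a boto3-style SQS client. Returns the number of
    messages processed (0 if the wait period expired with no messages).
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # fetch up to 10 messages per request
        WaitTimeSeconds=min(wait_seconds, MAX_WAIT_SECONDS),
    )
    messages = response.get("Messages", [])
    for message in messages:
        handle(message["Body"])
        # Delete only after the handler returned without raising.
        sqs.delete_message(QueueUrl=queue_url,
                           ReceiptHandle=message["ReceiptHandle"])
    return len(messages)
```

With boto3 installed, this could be called in a loop as `receive_with_long_polling(boto3.client("sqs"), queue_url, print)`. Each call then blocks for up to 20 seconds instead of returning empty immediately, so an idle queue costs at most a few requests per minute per consumer thread.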
Tip 2: SQS has introduced batch APIs with which you can bundle up to 10 messages, or 64KB, into a single batch request. APIs like SendMessageBatch, DeleteMessageBatch etc. can be used in the design to reduce the number of requests hitting SQS and thereby save costs. In the sample scenario the message sizes are in the range of 1-4KB, so you can easily pack 10 messages into a batch and send them to SQS as one request. This reduces the number of send requests from 10 million a day to ~1-2 million a day (1 million if we are able to tightly pack 10 messages into every batch). This translates to a reduction in daily cost from $5 to at most ~$1 a day, i.e. from $150 to ~$30 a month.
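The batching idea in Tip 2 could be sketched as follows. As before, `sqs` is assumed to be a boto3-style SQS client; `make_batches` and `send_in_batches` are hypothetical helper names. The 10-entry limit is an SQS rule, while the 64KB payload limit is the figure quoted in this post (as of 2013) and should be checked against the current documentation.

```python
from typing import Any, Iterator, List

MAX_BATCH_ENTRIES = 10        # SQS batch calls accept at most 10 entries
MAX_BATCH_BYTES = 64 * 1024   # payload limit quoted in this post (2013)


def make_batches(bodies: List[str]) -> Iterator[List[str]]:
    """Group message bodies into batches that respect both limits."""
    batch: List[str] = []
    batch_bytes = 0
    for body in bodies:
        size = len(body.encode("utf-8"))
        full = len(batch) == MAX_BATCH_ENTRIES
        too_big = batch_bytes + size > MAX_BATCH_BYTES
        if batch and (full or too_big):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(body)
        batch_bytes += size
    if batch:
        yield batch


def send_in_batches(sqs: Any, queue_url: str, bodies: List[str]) -> int:
    """Send all bodies with SendMessageBatch; return the request count."""
    requests = 0
    for batch in make_batches(bodies):
        entries = [{"Id": str(i), "MessageBody": body}
                   for i, body in enumerate(batch)]
        response = sqs.send_message_batch(QueueUrl=queue_url, Entries=entries)
        requests += 1
        if response.get("Failed"):
            # A real producer would retry the failed entries instead.
            raise RuntimeError(f"{len(response['Failed'])} entries failed")
    return requests
```

With 4KB messages, 10 entries add up to 40KB, so the entry count rather than the payload size is what limits each batch in the sample scenario.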

Other Tips

Cost Saving Tip 1: Amazon SQS Long Polling and Batch requests
Cost Saving Tip 2: How right search technology choice saves cost in AWS ?
Cost Saving Tip 3: Using Amazon CloudFront Price Class to minimize costs
Cost Saving Tip 4 : Right Sizing Amazon ElastiCache Cluster
Cost Saving Tip 5: How Amazon Auto Scaling can save costs ?
Cost Saving Tip 6: Amazon Auto Scaling Termination policy and savings
Cost Saving Tip 7: Use Amazon S3 Object Expiration
Cost Saving Tip 8: Use Amazon S3 Reduced Redundancy Storage  (new)
Cost Saving Tip 9: Have efficient EBS Snapshots Retention strategy in place (new)
Cost Saving Tip 10: Make right choice between PIOPS vs Std EBS volumes and save costs (new)
Cost Saving Tip 11: How elastic thinking saves cost in Amazon EMR Clusters ? (new)


