1:Shard the Data
Spread the data across multiple SimpleDB domains for better throughput. Many benchmarks published on the Internet suggest that a single SimpleDB domain can handle about 70 puts/sec. By default every account can create 250 SimpleDB domains, and more can be requested from AWS by filling out a request form.
Example: Let's assume a single SimpleDB domain offers 70 puts/sec and your application layer requires a throughput of 7000 requests/sec. To reach that rate, shard the data across 7000 / 70 = 100 SimpleDB domains so that the load is spread evenly among them.
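One way to picture the sharding in the example is the minimal Python sketch below. It assumes that items are assigned to domains by hashing the item name; the function names, the `mydata` domain prefix and the choice of MD5 are illustrative assumptions, not part of the SimpleDB API.

```python
import hashlib

PUTS_PER_DOMAIN = 70       # per-domain write rate assumed in the example
TARGET_THROUGHPUT = 7000   # requests/sec required by the application


def domains_needed(target, per_domain):
    """Return the number of domains needed, rounding up."""
    return -(-target // per_domain)


def shard_domain(item_name, num_domains, prefix="mydata"):
    """Map an item name to a domain name (hypothetical naming scheme).

    A stable hash is used so the same item always lands in the same domain.
    """
    digest = hashlib.md5(item_name.encode("utf-8")).hexdigest()
    index = int(digest, 16) % num_domains
    return f"{prefix}_{index:03d}"


NUM_DOMAINS = domains_needed(TARGET_THROUGHPUT, PUTS_PER_DOMAIN)  # 100
```

Reads use the same function to find the domain for an item name. Queries that are not keyed by item name have to be sent to every domain and merged in the application, which is the main cost of this design.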
2:Retries and Exponential Backoff
Amazon SimpleDB is accessed through web service calls, and you may occasionally encounter 500 or 503 errors. The usual technique for dealing with such error responses in AWS is to implement retries in the application layer. An application that implements this technique can maintain an excellent level of performance and availability because it automatically handles overload and server errors. This technique also increases the overall reliability of applications consuming the Amazon SimpleDB service.
In addition to simple retries, the best practice is to use an exponential backoff algorithm for better flow control. The algorithm has to be built into your application layer code. The concept behind exponential backoff is to use progressively longer waits between retries for consecutive error responses: up to 500 milliseconds before the first retry, up to 1500 milliseconds before the second, up to 6000 milliseconds before the third, and so on. The timings can vary depending on your use case.
Refer to this URL for more information: http://aws.amazon.com/articles/Amazon-SimpleDB/1394
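The retry logic described in this section could be sketched as follows. This is a minimal illustration, not an official client: `TransientServiceError` is a hypothetical exception standing in for a 500 or 503 response, `operation` is any callable that performs one SimpleDB request, and the wait caps are the example timings mentioned above.

```python
import random
import time


class TransientServiceError(Exception):
    """Hypothetical stand-in for an HTTP 500 or 503 response."""


def call_with_backoff(operation, max_waits_ms=(500, 1500, 6000), sleep=time.sleep):
    """Call operation(), retrying on transient errors with growing waits.

    Before each retry the caller sleeps for a random time between zero and
    the next cap in max_waits_ms, so consecutive failures wait longer.
    """
    for cap_ms in max_waits_ms:
        try:
            return operation()
        except TransientServiceError:
            sleep(random.uniform(0, cap_ms) / 1000.0)
    # Final attempt after the last wait; an error here is raised to the caller.
    return operation()
```

Picking a random wait below each cap, rather than the cap itself, keeps many clients that failed at the same moment from retrying in lockstep.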
3:Run from Amazon EC2
Amazon SimpleDB gives better latency if the queries are executed from Amazon EC2. This is because the web service calls stay inside the AWS network and avoid long round trips over the public Internet. By default, SimpleDB domains are created in the US East AWS region. Applications that run in other AWS regions (for example the Asia Pacific regions such as Tokyo, or the South America region in Brazil) should make sure that Amazon SimpleDB and Amazon EC2 are in the same region to get better performance. In addition, data transfer between EC2 and SimpleDB within the same AWS region is free.
- Use the BatchPut API (BatchPutAttributes) instead of PutAttributes for better write performance. A single BatchPut request takes up to 25 items on a single domain, subject to the limits of 256 attributes and 1 MB request size. It works like a batch commit in an RDBMS, so in case of failure all items are reverted. We have observed roughly 20x higher write throughput with BatchPut compared to the single Put API.
- Avoid non-indexed queries like "Select * from Domain..." in Amazon SimpleDB.
- When storing dates, it is recommended that you store all dates in Joda-Time format and use a single time zone.
- Zero-pad numbers so that they sort correctly, since SimpleDB compares values as strings. Choose the padding width based on the largest number in your data set.
- Design your queries so that they do not run for more than 5 seconds. Beyond that, Amazon SimpleDB will return an error or clip the results.
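Several of the tips above come down to small helper functions. The Python sketch below shows one possible shape for them; the function names are hypothetical, and the UTC ISO 8601 timestamp format is an assumption chosen because such strings sort chronologically when compared as text.

```python
from datetime import datetime, timezone

BATCH_LIMIT = 25  # maximum items per BatchPut request, as stated above


def zero_pad(number, width):
    """Render a non-negative integer as a fixed-width, zero-padded string."""
    if number < 0:
        raise ValueError("negative numbers need an offset before padding")
    return str(number).zfill(width)


def utc_timestamp(dt):
    """Format a timezone-aware datetime as a UTC string that sorts as text."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def chunked(items, size=BATCH_LIMIT):
    """Split a list of items into batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]
```

For example, `zero_pad(42, 5)` returns `"00042"`, and `chunked(list(range(60)))` yields three batches of 25, 25 and 10 items, each of which can be sent as one BatchPut request.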
5:Understand the SimpleDB limits
Amazon SimpleDB also enforces certain limits that apply to the domain data size, domain names, query execution time, result set size, etc. Please understand them before designing applications that use SimpleDB.