Amazon EBS snapshots are incremental backups: each snapshot copies only the blocks in the volume that have changed since the last snapshot. In subsequent snapshots, only the changed blocks (along with the snapshot's table of contents) are copied, in compressed form, to Amazon S3. If you have a volume with 10 GB of data but only 2 GB of data have changed since your last snapshot, only the 2 GB of modified data is written to Amazon S3 during the snapshot process.
In this article, based on my experience, I suggest some patterns to help you understand EBS snapshot costs in detail, avoid leakages, and save money while using snapshots.
Imagine you have 1.5 TB of EBS volumes, of which ~1 TB is occupied. Let us explore the patterns based on this assumption.
Read-only DB: Though in practice you will rarely find a database that is completely read-only, let us imagine one for the sake of understanding. After the first full snapshot copy (~1 TB), subsequent snapshots of this database occupy almost no additional storage space, because there is literally no change in the database: no changed blocks, and hence nothing copied to Amazon S3. If you have an automated snapshot mechanism running every day for this database, you can regularly delete all the old snapshot copies and retain only the latest one. Even if you forget to delete the old copies, you will not end up with much cost leakage in this case, because you pay for the ~1 TB first full snapshot and very little for the subsequent snapshot storage.
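Below is a minimal sketch, assuming boto3, of this "keep only the latest snapshot" housekeeping. The region and volume ID are placeholders, not values from this article, and pagination is omitted for brevity.

```python
import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')   # region is an assumption
VOLUME_ID = 'vol-0123456789abcdef0'                   # hypothetical volume ID

# List all snapshots owned by this account for the given volume.
resp = ec2.describe_snapshots(
    OwnerIds=['self'],
    Filters=[{'Name': 'volume-id', 'Values': [VOLUME_ID]}],
)

# Sort newest first and keep only the most recent snapshot.
snapshots = sorted(resp['Snapshots'], key=lambda s: s['StartTime'], reverse=True)
for snap in snapshots[1:]:
    print('Deleting old snapshot', snap['SnapshotId'])
    ec2.delete_snapshot(SnapshotId=snap['SnapshotId'])
```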
Normal Read-Write DB: A 90:10 read-write ratio is a common pattern in many databases. Imagine you have ~1 TB of used EBS storage and every day 7-10% of the data on the volume changes; with efficient compression in place, roughly ~30 GB of changed data is copied to Amazon S3 per day. The first full snapshot will take ~1 TB of snapshot storage space in S3, and every daily incremental will add ~30 GB. With a 30-day retention period for the snapshots, an additional ~900 GB of snapshot storage accumulates in S3, totaling ~1.9 TB over 30 days.
If the IT team does not have a mechanism in place to delete the snapshots regularly, then over a year they would have accumulated the following cost leakage:
1,024 GB (full snapshot) + 10,800 GB (incrementals aggregated at ~900 GB a month) = ~11,824 GB of snapshot storage consumed by the end of the year. Because snapshot storage is billed per GB-month and the stored amount keeps growing every month, the charges accumulated over the year come to roughly 7,933 USD at $0.095 per GB-month of snapshot storage in Amazon S3.
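A rough back-of-the-envelope sketch of that accumulation is below. The growth rate and price are the assumptions stated above, and the exact total depends on how the billed GB-months are counted, so treat the output as approximate rather than an exact reproduction of the figure above.

```python
# Model the "no deletion" scenario: a full snapshot plus ~900 GB of new
# incrementals every month, billed per GB-month of snapshot storage.
FULL_SNAPSHOT_GB = 1024       # first full snapshot
MONTHLY_INCREMENT_GB = 900    # ~30 GB/day of compressed incrementals
PRICE_PER_GB_MONTH = 0.095    # USD per GB-month (price assumed in this article)

total_cost = 0.0
stored_gb = FULL_SNAPSHOT_GB
for month in range(1, 13):
    stored_gb += MONTHLY_INCREMENT_GB        # snapshots pile up, nothing is deleted
    total_cost += stored_gb * PRICE_PER_GB_MONTH

print(f'Storage after 12 months: {stored_gb} GB')        # ~11,824 GB
print(f'Cumulative charges over the year: ~{total_cost:.0f} USD')
```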
Is there a better way to cut this leakage and reduce the cost in Amazon S3? Yes: since snapshots are incremental in nature, old snapshots can be deleted at any time. The IT team just needs a proper mechanism in place to manually or automatically delete older snapshots. Depending on the application's characteristics, they should define proper retention periods (with deletion) and a snapshot version maintenance strategy. This way they can manage snapshot storage efficiently and reduce the cost leakage. Imagine the same IT team has a 30-day retention strategy and a deletion mechanism in place; now let us revisit the costs:
1,024 GB (full snapshot) + 900 GB (incrementals maintained at ~900 GB a month) = ~1,924 GB of snapshot storage consumed at any point in time. This equates to roughly 2,290 USD over the year at $0.095 per GB-month of snapshot storage in Amazon S3. Having this snapshot retention/deletion process in place easily translates to about a 70% reduction in cost leakage. Refer to the table below for the cost comparison and savings:

Strategy                              Snapshot storage               Approx. cost over a year
No deletion mechanism                 ~11,824 GB by year end         ~7,933 USD
30-day retention with deletion        ~1,924 GB maintained           ~2,290 USD
Savings                               -                              ~70%
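Here is a minimal sketch, assuming boto3, of such a 30-day retention policy that deletes older snapshots of a volume. The volume ID, region, and retention window are placeholders you would adapt to your own environment.

```python
from datetime import datetime, timedelta, timezone
import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')   # region is an assumption
VOLUME_ID = 'vol-0123456789abcdef0'                   # hypothetical volume ID
cutoff = datetime.now(timezone.utc) - timedelta(days=30)   # 30-day retention window

paginator = ec2.get_paginator('describe_snapshots')
for page in paginator.paginate(
        OwnerIds=['self'],
        Filters=[{'Name': 'volume-id', 'Values': [VOLUME_ID]}]):
    for snap in page['Snapshots']:
        # StartTime is timezone-aware; delete anything older than the retention window.
        if snap['StartTime'] < cutoff:
            print('Deleting', snap['SnapshotId'], 'taken on', snap['StartTime'])
            ec2.delete_snapshot(SnapshotId=snap['SnapshotId'])
```

Scheduling a script like this (for example via cron) is one simple way to make the retention strategy automatic rather than relying on manual cleanup.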
More savings can be achieved in some use cases if the retention periods are even more compact. If your application has a higher write ratio, an efficient snapshot deletion strategy can help you save even more.
How frequently you take snapshots depends purely on the RTO/RPO of your DB. Some common patterns I have observed are every 5-10 minutes, every hour, and once a day. For consistency you need a file system, such as XFS, that can be frozen while the snapshot is taken. Take snapshots from the slaves, and if you take them once a day, schedule them during the period of least activity in your DB.
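A minimal sketch of that freeze-snapshot-unfreeze sequence follows, assuming boto3 and the xfs_freeze utility, run as root on the instance; the mount point and volume ID are hypothetical.

```python
import subprocess
import boto3

MOUNT_POINT = '/data'                      # hypothetical XFS mount point
VOLUME_ID = 'vol-0123456789abcdef0'        # hypothetical EBS volume ID

ec2 = boto3.client('ec2', region_name='us-east-1')   # region is an assumption

# Freeze the file system so no writes land while the snapshot point-in-time is taken.
subprocess.check_call(['xfs_freeze', '-f', MOUNT_POINT])
try:
    # The point-in-time is fixed as soon as create_snapshot returns, so the
    # file system only needs to stay frozen for the duration of this call.
    snap = ec2.create_snapshot(
        VolumeId=VOLUME_ID,
        Description='Consistent daily snapshot taken during low-activity window',
    )
    print('Started snapshot', snap['SnapshotId'])
finally:
    # Always unfreeze, even if the snapshot call fails.
    subprocess.check_call(['xfs_freeze', '-u', MOUNT_POINT])
```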
Other Tips
Cost Saving Tip 1: Amazon SQS Long Polling and Batch requests
Cost Saving Tip 2: How right search technology choice saves cost in AWS?
Cost Saving Tip 3: Using Amazon CloudFront Price Class to minimize costs
Cost Saving Tip 4: Right Sizing Amazon ElastiCache Cluster
Cost Saving Tip 5: How Amazon Auto Scaling can save costs?
Cost Saving Tip 6: Amazon Auto Scaling Termination policy and savings
Cost Saving Tip 7: Use Amazon S3 Object Expiration
Cost Saving Tip 8: Use Amazon S3 Reduced Redundancy Storage
Cost Saving Tip 9: Have efficient EBS Snapshots Retention strategy in place