Monday, April 8, 2013

Part 1: Understanding Amazon Elastic Block Store

Introduction to Amazon Elastic Block Store
Amazon Elastic Block Store (EBS) provides block-level storage volumes for use with Amazon EC2 instances. In a typical block-level storage system, raw storage volumes are created and the server operating system connects to them (over Fibre Channel, iSCSI, etc.) and uses them as individual drives. This flexibility makes block-level storage usable for a variety of application needs such as file storage, database storage, virtual machine volumes and more. You can run a variety of file systems on block-level storage, for example NTFS on Windows and ext3 or XFS on Linux. In simpler terms, Amazon EBS is like a massive SAN (Storage Area Network) inside the AWS infrastructure. The physical storage under the EBS hood could be hard disks, SSDs, etc. Amazon EBS is one of the most important and heavily used storage services of AWS; even building blocks like RDS, DynamoDB and CloudSearch possibly rely on EBS in the cloud.
In Amazon EBS you can allocate a disk volume of 1 GB to 1 TB in size, and the data written to it persists independently of the life of the Amazon EC2 instance it is attached to (unlike ephemeral disks). The volume is stored on redundant disks within a single Availability Zone, which means that the EC2 instances using an EBS volume must reside in the same AZ. The data is automatically replicated within that Availability Zone (presumably some form of RAID is employed internally by AWS) to prevent data loss due to the failure of any single hardware component. Since the lifetime of an EBS volume is separate from the instance on which it is mounted, you can detach it and later attach it to another EC2 instance in the same Availability Zone.

The following are a few terms that readers need to be conversant with before going deep into this multi-part article series:

IOPS: Input/Output Operations Per Second
Throughput: Read/write rate to storage (MB/s)
Capacity: Volume of data that can be stored (GB)
AZ: Availability Zone within the same Amazon EC2 region
SAN: Storage Area Network
RAID: Redundant Array of Independent Disks

EBS volumes can currently be classified into two types: Standard EBS volumes and Provisioned IOPS (PIOPS) volumes. Standard EBS volumes are the first-generation EBS volumes and are suitable for sequential I/O workloads. PIOPS volumes deliver more consistent performance and are targeted towards OLTP workloads. Let us explore each type of EBS volume in detail: what are the positives and negatives of using it, and what are the best practices, tips and points to remember while incorporating it in the architecture? Refer the articles here: Standard & PIOPS.

One of the biggest advantages of running your infrastructure in Amazon Web Services is the flexibility and choice it offers architects when making critical decisions. Though cost savings is not the main reason why users migrate to AWS, it is still an important factor to manage for efficient ongoing operations, given the on-demand nature of the Amazon cloud. Making the right choice between Standard EBS volumes and PIOPS volumes can have a real impact on your infrastructure cost in AWS. With a detailed understanding of which use cases best fit each volume type, you can provision suitable volume types in the architecture stack. Based on the analysis done, choosing the wrong volume type can lead to a cost leakage of between 20% and 70%. Refer the article here.
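To get a feel for how such a comparison works, here is a minimal sketch in Python. It assumes a pricing structure in which Standard volumes are billed per GB-month plus per million I/O requests, and PIOPS volumes are billed per GB-month plus per provisioned IOPS-month. The function names and all the rates used below are illustrative placeholders, not current AWS prices, so plug in the numbers from the AWS pricing page for your region.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assume a 30-day month for simplicity


def standard_monthly_cost(size_gb, avg_iops, gb_rate, per_million_io_rate):
    """Storage charge plus a charge for every I/O request actually issued."""
    io_requests = avg_iops * SECONDS_PER_MONTH
    return size_gb * gb_rate + (io_requests / 1_000_000) * per_million_io_rate


def piops_monthly_cost(size_gb, provisioned_iops, gb_rate, per_iops_rate):
    """Storage charge plus a flat charge for the IOPS you provisioned."""
    return size_gb * gb_rate + provisioned_iops * per_iops_rate


if __name__ == "__main__":
    # Placeholder rates (assumptions, not real prices).
    standard = standard_monthly_cost(100, avg_iops=100,
                                     gb_rate=0.10, per_million_io_rate=0.10)
    piops = piops_monthly_cost(100, provisioned_iops=1000,
                               gb_rate=0.125, per_iops_rate=0.10)
    print(f"Standard: {standard:.2f} per month, PIOPS: {piops:.2f} per month")
```

The point of the sketch is that the Standard bill grows with the I/O you actually perform, while the PIOPS bill is fixed by what you provision, so the cheaper option depends on how busy the volume really is.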

I read a line in the AWS documentation about EBS saying that PIOPS volumes deliver within "10% of your provisioned IOPS 99.9% of the time". What does this mean, and why is it important? Refer article here.
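To make that statement concrete, here is a small back-of-the-envelope sketch in Python. It assumes the wording means "within 10% of the provisioned rate" and that the 99.9% is measured over a full year. The function names are mine, for illustration only.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def piops_floor(provisioned_iops, tolerance=0.10):
    """Lowest IOPS that still counts as 'within tolerance' of the provisioned rate."""
    return provisioned_iops * (1.0 - tolerance)


def hours_outside_guarantee(fraction_of_time=0.999):
    """Hours per year during which performance may drop below the floor."""
    return HOURS_PER_YEAR * (1.0 - fraction_of_time)


if __name__ == "__main__":
    print(f"Floor for 2000 provisioned IOPS: {piops_floor(2000):.0f}")
    print(f"Hours per year outside the guarantee: {hours_outside_guarantee():.2f}")
```

Under these assumptions, a volume provisioned at 2000 IOPS should stay at or above roughly 1800 IOPS, except for a little under nine hours spread across the year.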

We all know that the AWS team constantly listens to user input and shapes its product road map accordingly. One such instance is the introduction of EBS-Optimized EC2 instances. In a normal EC2 instance, the network is shared between your front-facing workloads and your EBS I/O operations. This often leads to congestion in high-traffic web sites that rely heavily on OLTP database systems. Even if you RAID your EBS volumes to extract more IOPS, if the pipe between the EC2 instance and EBS is constantly choked you end up scaling up your instance type just to get more bandwidth. From a cost and compute perspective, this is not an ideal scenario. AWS introduced EBS-Optimized instances specifically to tackle this situation and give users better performance with EBS volumes. Refer this article to understand how EBS-Optimized instances can help you extract better performance from your architecture.

Extracting higher performance from EBS is always a topic of interest to me, as well as to lots of AWS enthusiasts and users. To achieve it you need a deeper understanding of I/O latency and I/O block size. The IOPS rate you get usually depends on the I/O size of your application's reads and writes, so it is very important to know the I/O size your application operates with and set your performance expectations for EBS volumes accordingly. Latency is a measure of how long a single I/O request takes from the application's perspective. IOPS numbers alone are meaningless without additional metrics such as latency, read/write percentage and I/O block size.
Let us explore in this section how these two factors can affect your performance expectations while using Amazon EBS, and why it is important to know about them before tuning. Refer articles here: IO Latency and IO Block Size.
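The relationship between these numbers can be sketched with a few lines of Python. This is a rough model and not an EBS specification: throughput is simply IOPS multiplied by I/O size, and the IOPS ceiling is derived from Little's law (outstanding requests = IOPS x latency). The function names are hypothetical.

```python
def throughput_mb_per_s(iops, io_size_kb):
    """Data rate implied by an IOPS figure at a given I/O size."""
    return iops * io_size_kb / 1024.0


def max_iops_from_latency(latency_ms, queue_depth=1):
    """Upper bound on IOPS when each request takes latency_ms and
    queue_depth requests are kept in flight (Little's law)."""
    return queue_depth * 1000.0 / latency_ms


if __name__ == "__main__":
    # The same 1000 IOPS moves four times more data at 16 KB than at 4 KB.
    print(throughput_mb_per_s(1000, 4))
    print(throughput_mb_per_s(1000, 16))
    # With one request in flight and 2 ms per request, 500 IOPS is the ceiling.
    print(max_iops_from_latency(2.0, queue_depth=1))
```

This is why an IOPS number quoted without the block size and latency behind it tells you very little about what your application will actually see.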

Performance extraction from Amazon EBS :
Pre warming the EBS Volume. Click here.
EBS Striping. Click here.
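As a rough illustration of why striping helps only up to a point, the sketch below estimates the IOPS of a RAID 0 stripe set in Python. It assumes that IOPS add up linearly across volumes and that the only other limit is the bandwidth between the instance and EBS. Both are simplifications, and the function name and the numbers in the example are hypothetical.

```python
def striped_iops(per_volume_iops, volume_count, io_size_kb, link_mb_per_s):
    """Estimated IOPS of a stripe set, capped by the instance-to-EBS link."""
    raw_iops = per_volume_iops * volume_count
    # How many I/Os of this size fit through the link every second.
    link_cap_iops = link_mb_per_s * 1024.0 / io_size_kb
    return min(raw_iops, link_cap_iops)


if __name__ == "__main__":
    # Assumed: 1000 IOPS per volume, 16 KB I/O, 125 MB/s of EBS bandwidth.
    for count in (2, 4, 8, 10):
        print(count, "volumes:", striped_iops(1000, count, 16, 125))
```

In this model, adding volumes beyond the link limit buys nothing, which is the situation that dedicated EBS bandwidth is meant to relieve.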

Since EBS volume redundancy is limited to a single Availability Zone, it becomes necessary to back up the data stored in volumes frequently and incrementally to a much more durable system like S3. An EBS snapshot is one such technique: it backs up the state of a volume at a particular point in time to S3, and snapshots are incremental in nature. Users have a lot of confusion about how these snapshots work internally: what happens if I delete an older snapshot? What benefits do snapshots bring to operations? In this section we deep dive into the workings of EBS snapshots and answer questions such as: what are some of the objectives for using them? How do you achieve consistent snapshots in AWS? What are the best practices, tips and points to note while using them? Refer article here.
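The "what happens if I delete an older snapshot" question is easier to reason about with a toy model. The Python sketch below is my own simplified illustration and not how AWS implements snapshots internally: each snapshot stores only the blocks that changed since the previous one and references the rest, and deleting a snapshot frees only the blocks that no remaining snapshot still needs. The class and method names are hypothetical.

```python
class SnapshotStore:
    """Toy model of incremental, block-level snapshots."""

    def __init__(self):
        self.snapshots = {}  # snapshot name -> {block index: block id}
        self.blocks = {}     # block id -> stored data
        self._next_id = 0

    def take(self, name, volume):
        """Snapshot a volume (dict of block index -> data).
        Returns the number of blocks that had to be newly stored."""
        previous = list(self.snapshots.values())[-1] if self.snapshots else {}
        manifest, stored = {}, 0
        for index, data in volume.items():
            prev_id = previous.get(index)
            if prev_id is not None and self.blocks[prev_id] == data:
                manifest[index] = prev_id  # unchanged: reference the old block
            else:
                self.blocks[self._next_id] = data
                manifest[index] = self._next_id
                self._next_id += 1
                stored += 1
        self.snapshots[name] = manifest
        return stored

    def delete(self, name):
        """Delete a snapshot. Returns the number of blocks actually freed."""
        manifest = self.snapshots.pop(name)
        in_use = {bid for m in self.snapshots.values() for bid in m.values()}
        unused = set(manifest.values()) - in_use
        for block_id in unused:
            del self.blocks[block_id]
        return len(unused)

    def restore(self, name):
        """Rebuild the full volume contents as of the given snapshot."""
        return {i: self.blocks[b] for i, b in self.snapshots[name].items()}


if __name__ == "__main__":
    store = SnapshotStore()
    volume = {0: "a", 1: "b", 2: "c"}
    print(store.take("first", volume))   # full copy of all three blocks
    volume[1] = "B"
    print(store.take("second", volume))  # only the changed block is stored
    print(store.delete("first"))         # frees only the superseded block
    print(store.restore("second"))
```

In this model the second snapshot remains fully restorable after the first one is deleted, because the unchanged blocks it references are kept.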

Though EBS snapshots are incremental in nature, it is recommended to have a proper snapshot deletion and retention strategy in place. If this is not addressed as part of your ongoing IT operations, you can end up with a 70% cost leakage in some scenarios. Refer this article to understand in detail the analysis behind the cost savings.
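What a retention strategy looks like depends on your recovery requirements, but the idea can be sketched in Python. The policy below is a hypothetical example of my own: keep the most recent N snapshots, plus the newest snapshot from each of the last M calendar weeks, and mark everything else for deletion. The function name, the default values and the snapshot ids are made up for illustration.

```python
from datetime import date, timedelta


def snapshots_to_delete(snapshots, keep_daily=7, keep_weekly=4):
    """snapshots: dict of snapshot id -> creation date.
    Returns the sorted ids that fall outside the retention policy."""
    newest_first = sorted(snapshots, key=lambda sid: snapshots[sid], reverse=True)
    keep = set(newest_first[:keep_daily])
    weeks_seen = []
    for sid in newest_first:
        week = tuple(snapshots[sid].isocalendar()[:2])  # (ISO year, ISO week)
        if week not in weeks_seen:
            if len(weeks_seen) == keep_weekly:
                break  # already kept one snapshot for enough weeks
            weeks_seen.append(week)
            keep.add(sid)  # newest snapshot of this week
    return sorted(set(snapshots) - keep)


if __name__ == "__main__":
    # Thirty daily snapshots ending on 8 April 2013 (hypothetical ids).
    start = date(2013, 3, 10)
    daily = {f"snap-{i:02d}": start + timedelta(days=i) for i in range(30)}
    doomed = snapshots_to_delete(daily)
    print(len(doomed), "of", len(daily), "snapshots can be deleted")
```

The function only computes the list; actually deleting the snapshots, and deciding how many days and weeks to keep, is left to your own tooling and recovery objectives.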

EBS snapshots have Amazon EC2 regional scope. If you are planning to expand geographically, or want data backed up to a different Amazon EC2 region at regular intervals to minimize data loss and recovery time, you can use the EBS snapshot copy feature. How do you copy EBS snapshots? How long does copying between regions take? These questions, along with points to remember during implementation, are covered in this article. Click here to understand more about the EBS snapshot copy feature.

Common best practices for securing your EBS volumes, such as using IAM to control access to your EBS volumes, wiping data from EBS volumes, sharing Amazon EBS snapshots and storing AWS credentials on an EBS snapshot securely, are discussed in this article. If you want deeper protection for EBS, you can also encrypt the volume using Trend Micro SecureCloud. Check this article for instructions on implementing it in Amazon Web Services infrastructure.

EBS Article Series (continued..)

Part 1: Understanding Amazon Elastic Block Store
Part 2: Understanding Standard EBS Volumes
Part 3: Understanding EBS PIOPS Volumes
Part 4: Understanding EBS-Optimized Instances
Part 5: Understanding Latency in EBS
Part 7: 10% of your provisioned IOPS 99.9% of the time
Part 8: Performance Tuning - Pre Warming the EBS volume
Part 9: Performance Tuning - EBS Striping
Part 10: Performance Tuning - IO Block Size
Part 11: Understanding Amazon EBS Snapshots
Part 12: Securing Amazon EBS volumes - EBS Encryption using SecureCloud (new)
Part 13: Amazon EBS Security Best practices and tips
Part 14: How proper EBS Snapshot Retention strategy can save costs (new)
Part 15: Amazon EBS Snapshot copying- How-to, Performance and Tips (new)
Part 16: Make right choice between PIOPS vs Std Volumes and save cost (new)

All posts, comments and views expressed in this blog are my own and do not represent the positions or views of my past, present or future employers. The intention of this blog is to share my experience and views. Content is subject to change without any notice. While I will do my best to quote the original author or copyright owners wherever I reference them, if you find any of the content or images violating copyright, please let me know and I will act upon it immediately. Lastly, I encourage you to share the content of this blog in general with other online communities for non-commercial and educational purposes.