Monday, January 21, 2013

Part 2: Log archival with Amazon S3 and Amazon Glacier

Log archive & analysis with Amazon S3 and Glacier - Part II

This blog article is syndicated from original article written at
You can connect with author@

In the previous post, we saw how to configure logging in Amazon CloudFront and start collecting access logs. Let's now move on to the next tier in the architecture - web/app. The following are the key considerations for logging in the web/app layer:
  • Local Log Storage - choosing local storage for log files that is large enough, cost-effective and meets the performance requirements of logging
  • Central Log Storage - how do we centrally store log files for future analysis
  • Dynamic Infrastructure - how do we collect logs from multiple servers that are provisioned on demand
Local Log Storage
Except in very few cases, EBS-backed Instances are the most sought-after Instance type. They launch quickly and it is easier to build Images out of them. But they come with a couple of limitations from a logging perspective:
  • Limited storage - EBS-backed AMIs that are provided by AWS or third-party providers come with limited storage. For example, a typical RHEL AMI comes with around 6GB of EBS attached as the root partition. Similarly, Windows AMIs come with 30GB of EBS attached as C:\
  • Growing EBS - log files tend to grow quickly, and it becomes difficult to grow the root EBS Volume (or an additional EBS Volume) as the log file sizes grow
  • Performance - any I/O operation on an EBS Volume goes over the network, and it tends to be slower than local disk writes. Specifically for logging, it is always better to remove the I/O bottleneck. Otherwise a lot of system resources could be spent on logging
Most EC2 Instance types come with ephemeral storage. This is local storage directly attached to the host on which the Instance is running. Ephemeral storage does not persist between stop-start cycles of an (EBS-backed) Instance, but it is available while the Instance is running and persists across reboots. There are a couple of advantages of ephemeral storage:
  • It is locally attached to the physical host on which the Instance runs and hence has better I/O throughput when compared to EBS Volumes
  • It comes in a pretty good size - for example, an m1.large Instance comes with 850GB of ephemeral storage
  • It comes free of cost - you aren't charged per GB or for any I/O operations on the ephemeral storage, unlike EBS
This makes ephemeral storage the ideal candidate for storing log files. For an EBS-backed Instance, the ephemeral storage is not mounted and readily available by default. Hence one needs to follow these steps to start using the ephemeral storage for storing log files:
  • The logging framework usually comes with a configuration file for logging parameters. The log file path needs to be configured to point to the ephemeral storage mount directory that we create below
  • All application-related files (such as binaries, configuration files, web/app server) will be installed on the root EBS Volume. Before the final AMI is created, the ephemeral storage needs to be set up and configured
  • Run fdisk to list all the storage devices that are attached to the Instance
fdisk -l
  • Create a directory such as "/var/log/applogs". This is the directory where the ephemeral storage will be mounted
mkdir /var/log/applogs
  • Mount the storage device in this directory using the "mount" command
mount /dev/xvdj /var/log/applogs
  • Add an "fstab" entry for the device's filesystem (XFS and ext3 examples are shown) so that the ephemeral storage is mounted in the same directory after stop/start or when new Instances are launched out of this AMI
# Use only one entry, matching the filesystem on the device. XFS:
/dev/xvdj  /var/log/applogs  xfs   defaults,noatime,nobarrier,allocsize=64k,logbufs=8  0 2
# ext3:
/dev/xvdj  /var/log/applogs  ext3  defaults  0 0
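The fstab step can be made repeatable, so that re-running the setup before baking the AMI does not add a duplicate entry. The following is a minimal sketch, assuming the /dev/xvdj device and /var/log/applogs directory from the example above; `fstab_entry` and `ensure_fstab_entry` are hypothetical helper names, and the fstab path is passed as a parameter so the script can be tried on a scratch file first.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of any AWS tooling.

# Print one fstab line: device, mount point, filesystem type, options.
fstab_entry() {
    printf '%s\t%s\t%s\t%s\t0\t0\n' "$1" "$2" "$3" "${4:-defaults}"
}

# Append an entry to the given fstab file unless the mount point is
# already listed there (comment lines are ignored).
ensure_fstab_entry() {
    fstab_file=$1
    device=$2
    mount_point=$3
    fs_type=$4
    options=${5:-defaults}
    if awk -v mp="$mount_point" \
        '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' \
        "$fstab_file"; then
        return 0    # entry already present, nothing to do
    fi
    fstab_entry "$device" "$mount_point" "$fs_type" "$options" >> "$fstab_file"
}
```

On a real Instance this might be invoked as root with something like `ensure_fstab_entry /etc/fstab /dev/xvdj /var/log/applogs ext3`, substituting the device name reported by fdisk and the filesystem actually present on it.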

The last step is essential, especially from an AutoScaling point of view. When AutoScaling launches new Instances, the ephemeral storage needs to be automatically mounted in the directory so that the application can start logging. Now we can go ahead, create the final AMI and launch Instances from it. The new Instances will have the ephemeral storage automatically mounted in the "/var/log/applogs" directory, and applications can start storing their log files in it.
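Since a newly launched Instance only logs to ephemeral storage if the mount actually succeeded, an application start script might verify that first. This is a minimal sketch, assuming a Linux Instance where /proc/mounts lists the active mounts; `is_mounted` is a hypothetical helper, and the mounts file is a parameter only so that the check can be exercised against a sample file.

```shell
#!/bin/sh
# Sketch only: is_mounted is an illustrative helper name.

# Succeed if mount_point appears as a mount target in the mounts table.
# Defaults to /proc/mounts, whose second field is the mount point.
is_mounted() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}
    awk -v mp="$mount_point" \
        '$2 == mp { found = 1 } END { exit !found }' \
        "$mounts_file"
}
```

A start script could then run `is_mounted /var/log/applogs || mount /var/log/applogs` before launching the application; with the fstab entry in place, mount can look up the device from the directory name alone.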
