Monday, February 18, 2013

Log Analysis and Archive with Amazon S3 and Glacier - Full Summary

Logging is an essential component of any system; it helps you understand what is really going on inside it. Just as you design systems to scale, tune performance, and plan a caching architecture, logging requires special care if you want to collect logs effectively and make sense of them.

In the Cloud, and more specifically in AWS, there are numerous options and considerations with respect to logging, such as:
  • What are the different sources from which you can collect logs?
  • How do you collect logs from a dynamic infrastructure?
  • How can logs be collected without affecting the performance of the system?
  • What storage options are available?
  • And, most importantly, how can all of this be done cost-effectively?
When I set out to write about this, I realized it was going to be a lengthy article covering many areas. Logging is also an area whose importance is understood only when things go wrong; otherwise it is pretty boring :) So I decided to split my thoughts into multiple posts, which I have been writing over the past month. This post is a summary of all of them.

The Introduction - the introductory post, setting the context for the different areas covered in this multi-part series

Part I - in this part, we define the log structure and look at how to collect logs from Amazon CloudFront, the Content Distribution Network service from AWS

Part II - this post describes how to use the local storage of the EC2 instance for logging

Part III - this part discusses how to collect logs from multiple dynamically provisioned instances, how to rotate the log files, and how to store them in a centralized log storage

Part IV - in this final post, we look at the storage options available for cost-effective logging, how to use Glacier, the archival service from AWS, the best practices to keep in mind, and a list of third-party / commercial log management solutions available in the market

I hope this is of some use to you and provides some insight into logging in AWS. I would love to hear any comments or alternative approaches.