Logging is an essential part of any system. It lets you understand what is going on in your system and serves as a vital source for debugging. Many systems use logging primarily to let developers debug issues in the production environment. But there are systems where logging becomes an essential component for understanding the following:
- User Behavior - understanding usage patterns, such as which areas of the system users spend their time in
- Feature Adoption - evaluating how a new feature is being used. Do users drop off after a particular step in a flow? Do people from a specific geography use it during a specific time of the day?
- Click-through analysis - say you are placing relevant ads across different pages of your website; you would like to know how many users clicked them, their demographic breakdown, and so on
- System performance
- Abnormal behavior in certain areas of the system - for example, a particular step in a workflow resulting in errors or exceptions
- Performance of different areas of the system - such as finding out whether a particular screen takes longer to load because of a slow query. Should we optimize the database? Should we introduce a caching layer?
Any architect would enforce logging as a core component of the technical architecture. While logging is definitely required, inefficient logging - too much logging, inappropriate log levels - often leads to the following:
- Degraded system performance - the system could spend more resources on logging than on actively serving requests
- Huge log files - log files grow very fast, especially when inappropriate log levels are used, such as "debug" for all log statements
- Inadequate data - if the logs contain only the developer's debug information, there is little meaningful analysis that can be performed
On the other hand, the infrastructure architecture also needs to support efficient logging and analysis:
- Local Storage - how do you efficiently store log files on the local server without running out of disk space, especially when log files tend to grow quickly?
- Central Log Storage - how do you centrally store log files so that they can be used later for analysis?
- Dynamic Server Environment - how do you make sure you collect and store all the log files in a dynamic environment where servers are provisioned and de-provisioned on demand depending on load?
- Multi source - how do you handle log files from different sources, such as your web servers, search servers, and Content Distribution Network logs?
- Cost effective - as your application grows, so do your log files. How do you store them in the most cost-effective manner without burning a lot of cash?
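The "Local Storage" concern above is commonly addressed with log rotation. As one minimal sketch (file names and sizes are assumptions for illustration), Python's `RotatingFileHandler` caps local disk usage; the rotated files are what a shipping agent would later forward to central storage:

```python
import logging
from logging.handlers import RotatingFileHandler

# Cap local disk usage: at most 5 backups of 10 MB each (~60 MB total
# including the active file). Rotated files (app.log.1 ... app.log.5)
# are the ones a log shipper would pick up and forward centrally.
handler = RotatingFileHandler(
    "app.log", maxBytes=10 * 1024 * 1024, backupCount=5
)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(message)s")
)

logger = logging.getLogger("webapp")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("request served path=/home status=200 latency_ms=42")
```

A time-based variant (`TimedRotatingFileHandler`) works the same way when you prefer daily files over size-capped ones.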
In this multi-post article, let's take the case of a typical e-commerce web application with the above characteristics and set up a best-practice architecture for logging, analysis, and archiving in AWS. We will see how different AWS services can be used to store and process logs from different sources in a cost-effective and efficient manner.