Monday, August 27, 2012

AWS updates that make architects happy

AWS keeps adding features to its existing services and keeps launching new ones. With this pace of innovation and the rate at which new services and updates are released, one can lose track of them pretty easily. As architects, it is essential to stay on top of all of these so that we propose the right solution for a given requirement. Here are some of the features and services that have made architects' lives easier and can help in designing simpler infrastructure for specific needs. There is no particular order to this list - I am just listing them down. And here's a summary before we go into detail on each of them


High I/O EC2 Instance Type - This is a new family of Instance types from AWS, in addition to the existing High Memory and High CPU Instance types. The first one in this family is hi1.4xlarge, which comes with 60GB of memory and 8 virtual cores. The best part of this Instance type is the whopping 2 TB of SSD storage attached to it. These SSDs are the ephemeral disks that come as part of the Instance and are of course non-persistent. AWS promises around 120K random read IOPS and between 10K and 85K write IOPS. This Instance type is suitable for I/O-intensive applications and running NoSQL databases. Here are a couple of pointers about this Instance type:

  • SSD disks are ephemeral - you will lose data on Instance termination (and also if you stop and start the Instance)
  • Data persists across reboots, like any other EC2 Instance
  • You will see two 1TB volumes on the Instance. They can be striped to double the throughput
  • If you are launching a Windows Instance, choose the "Microsoft Windows 2008 R2 64-bit for Cluster Instances" AMI
  • Currently this Instance type is available only in US-East and EU-West regions
  • And it's priced at $3.10 an hour :)
I am sure AWS will be releasing smaller Instance types in this family (such as a High I/O large) since not everyone will be looking to use a 4xlarge Instance. There are already some benchmarks that you can refer to, and Netflix in particular has performed a detailed benchmark of this Instance type.
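As a back-of-the-envelope illustration of the striping point above (a sketch using the AWS-quoted figures from this post, not measured numbers):

```python
# Rough sketch: aggregate read throughput from striping (RAID 0) the two
# 1TB ephemeral SSD volumes on hi1.4xlarge. Striping scales read IOPS
# roughly linearly with volume count, ignoring controller/CPU overhead.
def striped_iops(per_volume_iops, num_volumes):
    return per_volume_iops * num_volumes

# If each volume delivers ~60K random read IOPS on its own, the striped
# pair approaches the ~120K figure AWS quotes for the Instance:
print(striped_iops(60_000, 2))  # 120000
```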

Provisioned IOPS for EBS - EBS is great for storing persistent data, with the ability to attach it to and detach it from any Instance. But from day zero, performance and throughput have been an issue with EBS. The I/O throughput that one gets out of EBS is poor and, worse, it is not consistent. If the network load at a given time is light, one might see good throughput from EBS; if not, the throughput will hover around 100 IOPS. Fast forward to now: with Provisioned IOPS we can set the IOPS that we expect from an EBS volume and AWS will guarantee that rate at all times. This is great news for hosting databases on EC2 (if you are not using RDS). Here are a couple of pointers on this:
  • Currently the max IOPS that can be set is 1000. AWS will soon increase this limit
  • We can of course drive more throughput by attaching multiple EBS volumes and striping them
  • Costs a little more than a normal EBS volume - $0.125/GB against $0.10/GB. Of course the benefits will be far more than the cost difference. I guess everyone will eventually move to Provisioned IOPS. A nice way to increase AWS revenue :)
I am assuming that Provisioned IOPS will be extended to RDS as well in the near future. Just like DynamoDB, where one can specify the required throughput, AWS should let the user specify the throughput required from RDS. RDS already uses EBS behind the scenes for storing data, and hence this should definitely be in the pipeline. At least I wish so.
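As a sketch of how such a volume is requested (the parameter shape follows the EC2 CreateVolume call as exposed by the boto3 SDK, which is newer than this post; the size, AZ and 1,000-IOPS cap reflect the limits described here):

```python
# Hedged sketch: building the request for a provisioned-IOPS EBS volume.
MAX_PIOPS = 1000  # per-volume cap at the time of writing

def piops_volume_request(size_gb, iops, availability_zone):
    if iops > MAX_PIOPS:
        raise ValueError("AWS currently caps provisioned IOPS at %d per volume" % MAX_PIOPS)
    return {
        "AvailabilityZone": availability_zone,
        "Size": size_gb,
        "VolumeType": "io1",  # the provisioned-IOPS volume type
        "Iops": iops,         # the rate AWS will guarantee
    }

# One would then hand this to the API, e.g.:
#   ec2_client.create_volume(**piops_volume_request(100, 1000, "us-east-1a"))
req = piops_volume_request(100, 1000, "us-east-1a")
print(req["VolumeType"], req["Iops"])  # io1 1000
```

Striping several such volumes, as noted above, multiplies the guaranteed rate.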

EBS-Optimized EC2 Instances - One thing everyone understands about EBS is that it is network-attached. Unlike local ephemeral storage, which is directly attached to the EC2 Instance, there is a network pipe that carries all the data written to EBS. So even if we use Provisioned IOPS to increase the throughput of an EBS volume, the network throughput can be so low that we cannot utilize the increased EBS throughput - especially when the network is shared between many EC2 Instances. AWS has introduced EBS-Optimized EC2 Instances, which have dedicated network throughput to the EBS volumes attached to them. An EBS-Optimized m1.large Instance will have about 500 Mbit/s of guaranteed network throughput to EBS.
  • Currently only m1.large, m1.xlarge and m2.4xlarge are available as EBS-Optimized Instances
  • The guaranteed throughput is 500 Mbits/s for m1.large and 1000 Mbits/s for m1.xlarge and m2.4xlarge
  • This should not be confused with the regular network throughput of the EC2 Instance - for example talking to another EC2 Instance. This guaranteed throughput is only for the EBS related traffic. All other traffic is through the other network pipe given to the EC2 Instance (which is not guaranteed and differs by Instance)
  • Comes with additional hourly cost - Optimized m1.large will cost $0.345/hr against a normal on demand m1.large costing $0.320/hr
  • Always use EBS-Optimized Instance when using Provisioned IOPS
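To see why the dedicated pipe matters, here is a rough calculation of how many IOPS can physically fit through it (plain arithmetic, ignoring protocol overhead; the 16 KB I/O size is an illustrative assumption):

```python
# How many IOPS of a given size can a dedicated EBS pipe sustain?
def max_iops_for_pipe(pipe_mbits_per_sec, io_size_kb=16):
    bytes_per_sec = pipe_mbits_per_sec * 1_000_000 / 8  # Mbit/s -> bytes/s
    return int(bytes_per_sec / (io_size_kb * 1024))

# m1.large's 500 Mbit/s pipe, at 16 KB per I/O:
print(max_iops_for_pipe(500))   # 3814
# m1.xlarge / m2.4xlarge's 1000 Mbit/s pipe:
print(max_iops_for_pipe(1000))  # 7629
```

In other words, the network pipe, not the volume, is often the real ceiling - which is exactly why Provisioned IOPS and EBS-Optimized Instances should be used together.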
Elastic Network Interface (ENI) - This feature is available for Instances launched within a Virtual Private Cloud (VPC). When you launch Instances within a VPC, you normally specify the subnet to launch them in and also the private IP address of the EC2 Instance. If the Instance is in a public subnet, you attach an Elastic IP so that you can access the Instance over the internet. With an ENI, you no longer specify a subnet while launching the Instance. You create an ENI (which is attached to a subnet) and attach the ENI to the Instance. So what are the benefits of using ENIs?
  • You can attach multiple ENIs (belonging to different subnets) to an EC2 Instance
  • One ENI can be configured to have a public IP (Elastic IP) and private IP and the other ENI can be configured to have just a private IP
  • With such a setup, you can allow traffic on port 80 through the first ENI and SSH traffic (port 22) through the other ENI
  • This way you can further secure your VPC architecture by allowing SSH traffic only from your corporate datacenter

Image Courtesy - AWS
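The two-ENI setup above can be sketched as the request payloads one would hand to the EC2 API (parameter names follow boto3's create_network_interface and attach_network_interface calls, a newer SDK than this post; all IDs and addresses are placeholders):

```python
# Sketch: describing the two ENIs and their attachment requests.
def eni_request(subnet_id, private_ip, security_group_ids):
    return {
        "SubnetId": subnet_id,          # an ENI belongs to a subnet, not an Instance
        "PrivateIpAddress": private_ip,
        "Groups": security_group_ids,   # security groups apply per-ENI
    }

def attach_request(eni_id, instance_id, device_index):
    # device index 0 is the primary interface; 1 is the second ENI
    return {
        "NetworkInterfaceId": eni_id,
        "InstanceId": instance_id,
        "DeviceIndex": device_index,
    }

# Public-facing ENI (port 80 security group) and a management ENI (port 22):
web_eni  = eni_request("subnet-public",  "10.0.0.10", ["sg-web"])
mgmt_eni = eni_request("subnet-private", "10.0.1.10", ["sg-ssh"])
print(web_eni["SubnetId"], mgmt_eni["SubnetId"])  # subnet-public subnet-private
```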
Multiple IP Addresses per Instance - with this feature release, AWS addressed the long-awaited request for hosting multiple SSL websites on a given Instance. This feature is an extension of the ENI feature, where each ENI is now allowed to have multiple public and private IPs. We can create an ENI, create multiple Elastic IPs and attach them to the ENI. The ENI is then attached to the EC2 Instance (during launch, or hot-attached while running) and the Instance gets multiple IP addresses.
  • The multiple public and private IP addresses are attached to the ENI, and reach the Instance when we attach the ENI to it. The difference is that if we terminate an Instance and launch a new one with the same ENI, all the mappings remain, and the new Instance will also get all the public IPs and their private IP mappings
  • Available only within VPC currently
  • There is an additional cost for each additional public IP that we attach. There is no cost for the additional private IPs
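A minimal sketch of requesting the extra private IPs (shaped like boto3's assign_private_ip_addresses call; the ENI ID is a placeholder):

```python
# Sketch: ask AWS for extra secondary private IPs on an ENI. Each
# additional public IP is then an Elastic IP associated with one of
# these private addresses.
def extra_private_ips_request(eni_id, count):
    return {
        "NetworkInterfaceId": eni_id,
        "SecondaryPrivateIpAddressCount": count,  # let AWS pick the addresses
    }

req = extra_private_ips_request("eni-12345678", 3)
print(req["SecondaryPrivateIpAddressCount"])  # 3
```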
ELB Security Group - When you use an Elastic Load Balancer (ELB), you will obviously place EC2 Instances behind it to accept incoming traffic. For quite a while, on the Security Group of those EC2 Instances, you had to open the incoming port (the one ELB sends traffic to) to all IPs (0.0.0.0/0). This was because ELB scales itself, and you cannot know at a given point in time how many ELB instances will be running or at which IPs. But now, ELBs come with their own Security Groups. So on the Security Group of the EC2 Instances that sit behind the ELB, you only have to open the port to the ELB's Security Group. This way you secure your EC2 Instances to accept traffic only from the ELB.
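The tightened rule can be sketched as follows (the structure matches boto3's authorize_security_group_ingress, a newer SDK than this post; the group IDs are placeholders). The key point is that the traffic source is a Security Group reference, not a CIDR block:

```python
# Sketch: allow port 80 only from the ELB's security group, instead of
# opening it to 0.0.0.0/0.
def allow_from_elb_only(instance_sg_id, elb_sg_id, port=80):
    return {
        "GroupId": instance_sg_id,
        "IpPermissions": [{
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            # source is another security group - no CIDR needed
            "UserIdGroupPairs": [{"GroupId": elb_sg_id}],
        }],
    }

rule = allow_from_elb_only("sg-instances", "sg-elb")
print(rule["IpPermissions"][0]["UserIdGroupPairs"][0]["GroupId"])  # sg-elb
```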

Internal Load Balancer - You can launch an ELB within a VPC as an internal load balancer. In this fashion, the ELB does not have a public IP address and will only have private IP addresses. Such a load balancer can be used for internal load balancing between different tiers of the architecture. I wrote a separate article explaining how it can be used.

Custom Security Group for ELB - If you launch ELBs within a VPC (either internal or public-facing), you have the option of specifying a custom Security Group of your own. This way, custom rules can be configured on the ELB's Security Group instead of using the ELB's default Security Group (which cannot be edited).
ELB Custom Security Group
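Combining the two features, here is a sketch of creating an internal ELB carrying a custom Security Group (parameter names follow boto3's classic-ELB create_load_balancer call, a newer SDK than this post; names, subnets and IDs are placeholders):

```python
# Sketch: an internal (private-IP-only) ELB with a custom security group,
# as available for ELBs launched inside a VPC.
def internal_elb_request(name, subnet_ids, custom_sg_id):
    return {
        "LoadBalancerName": name,
        "Listeners": [{"Protocol": "HTTP", "LoadBalancerPort": 80,
                       "InstanceProtocol": "HTTP", "InstancePort": 80}],
        "Subnets": subnet_ids,
        "Scheme": "internal",         # no public endpoint - tier-to-tier only
        "SecurityGroups": [custom_sg_id],
    }

req = internal_elb_request("app-tier-elb", ["subnet-a", "subnet-b"], "sg-custom")
print(req["Scheme"])  # internal
```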

S3 Object Expiration
This feature is great for storing logs and scheduling them to be deleted after a definite period of time. When you have multiple web or application servers, you will definitely consider pushing the logs generated on them to S3 for centralized storage. Even if you are running a minimal setup, you certainly do not want to run out of space on your Instance. But you do not want to store the logs perpetually either - one month of logs will be good enough for most applications. Instead of writing custom scripts that periodically delete the logs stored in S3, use this feature and Amazon will automatically delete them. It saves some cost, too.
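A sketch of such an expiration rule (the structure matches the S3 lifecycle configuration that boto3's put_bucket_lifecycle_configuration accepts; the prefix and retention period are placeholders):

```python
# Sketch: expire everything under logs/ after 30 days - S3 deletes the
# matching objects for you, no cron job required.
def log_expiration_rule(prefix="logs/", days=30):
    return {
        "Rules": [{
            "ID": "expire-old-logs",
            "Prefix": prefix,             # only objects under this key prefix
            "Status": "Enabled",
            "Expiration": {"Days": days}, # age after which S3 deletes them
        }]
    }

cfg = log_expiration_rule()
print(cfg["Rules"][0]["Expiration"]["Days"])  # 30
```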

Route 53 Alias record

Route53 is a scalable DNS service from AWS. With pay-as-you-go pricing, there are a whole lot of benefits (in terms of flexibility) with Route53. It may not yet be as sophisticated as other DNS service providers in terms of feature set, but the service is regularly updated with features such as latency-based routing and weighted round robin. One feature that will be of interest is the "Alias Record". Let's say you have a website called "example.com" and you are hosting the infrastructure in AWS. If your load balancer layer uses AWS Elastic Load Balancing, then you will be provided with a CNAME for the ELB. Internally, the ELB may employ any number of servers to scale itself, and hence it may have more than one IP. That is why AWS provides a CNAME for an ELB, which may resolve to one or more IPs at any time. With that background,


  • You want to create a DNS record for "www.example.com"
  • You also need to create a DNS record for "example.com" - the naked (apex) domain. Most users will use only the naked domain
By design, DNS does not support adding a CNAME entry for the naked domain - simply put, you cannot point "example.com" to an ELB CNAME, though you can point "www.example.com" to one. The naked domain can only be pointed to an A record. Having only A records for the naked domain introduces complexity in HA and scalability for the load balancing layer, and if your load balancing layer employs ELB, there is no way to know the IPs beforehand. One way to address this constraint is to put up a separate Apache server that redirects the naked domain to "www", but this will be a serious bottleneck in the architecture: most users will use only the naked domain, and soon the infrastructure will crumble even though we have built HA and scalability into all the other layers. With the "Alias" feature,

  • First create a Route53 record set for "www.example.com" and point it to the ELB CNAME
  • Create another record set for "example.com" and set its type as "Alias". Point the record value to the record set created above, or directly to the ELB CNAME
  • With the "Alias" option, you can point the record value either to an ELB CNAME or to another record set available in Route53
This way, you take care of both naked domain and "www" with both pointing to the load balancing layer in the AWS setup.
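The alias record for the naked domain can be sketched as a Route53 change batch (shaped like the input to boto3's change_resource_record_sets, a newer SDK than this post; the ELB DNS name and its hosted zone ID are placeholders):

```python
# Sketch: an alias record pointing the naked domain at an ELB. To DNS
# clients it looks like an A record, which is what sidesteps the
# no-CNAME-at-the-apex restriction.
def naked_domain_alias(domain, elb_dns_name, elb_zone_id):
    return {
        "Changes": [{
            "Action": "CREATE",
            "ResourceRecordSet": {
                "Name": domain,   # the naked domain, e.g. example.com.
                "Type": "A",      # alias records resolve as A records
                "AliasTarget": {
                    "HostedZoneId": elb_zone_id,   # the ELB's zone, not yours
                    "DNSName": elb_dns_name,
                    "EvaluateTargetHealth": False,
                },
            },
        }]
    }

batch = naked_domain_alias("example.com.",
                           "my-elb-123.us-east-1.elb.amazonaws.com.",
                           "ZELBEXAMPLE")
print(batch["Changes"][0]["ResourceRecordSet"]["Type"])  # A
```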


Of course, there are a whole lot of other features and updates that AWS keeps pushing across its services. The above are some of the features that help architects while designing infrastructure. These have personally helped us build better architectures. Prior to these features, we always had to look for alternatives such as tuning, custom solutions, or sacrificing certain parameters to achieve others. I am sure there are many more such updates that we can uncover, and more coming from AWS in the future. That will be for another post here.

Monday, August 20, 2012

The AdWantageS

Every customer has a reason to move into the cloud. Be it cost, scalability or on-demand provisioning, there are plenty of reasons why one moves into the Cloud. The latest whitepaper "The Total Cost of (Non) Ownership of Web Applications in the Cloud" by Jinesh Varia, Technology Evangelist at Amazon Web Services, provides a good comparison between hosting an infrastructure in-house and in the Cloud (AWS). There are plenty of pricing models available with AWS currently, which can provide cost benefits ranging from 30% to 80% compared to hosting the servers in-house.

On-Demand Instances - this is where everyone starts. You simply add your credit card to your AWS account and start spinning up Instances. You provision them on demand and pay for how long you run them, with the option of stopping and starting them whenever needed. You are charged for every hour an Instance runs. For example, for a Large Instance (2 CPUs, 7.5GB memory, Linux) you will pay $0.32/hr (US-East).

Reserved Instances - let's say you migrate/host your web application to AWS and run multiple web servers and DB servers there. After a couple of months, you may notice that some of your servers run 24 hours a day. You may spin up additional web servers during peak load, but you will always run at least two of them, plus a DB server. For such cases, where you know you will always run the Instances, AWS provides an option for reducing your cost: Reserved Instances. This is purely a pricing model, where you purchase the required Instance type (say Large/X-Large) in a region for an upfront fee. You will then be charged substantially less for the on-demand usage of that Instance. This way there are potential savings of 30% over a period of one year compared to On-Demand Instances. The following provides an illustration of the cost savings of purchasing an m1.large Reserved Instance against using it on demand for a year.
Cost comparison between On-Demand and Reserved Instance for m1.large Linux Instance in US-East

Be careful that,
  • Reserved Instances are purchased against an Instance type. If you purchase an m1.large Reserved Instance, then at the end of the month, when your bill is generated, only m1.large usage will be billed at the reduced hourly charge. Hence, if during that month you figure out that m1.large is not sufficient and move up to m1.xlarge, you will not be billed at the reduced hourly charge. In such a case, you may end up paying more to AWS on a yearly basis. So examine your usage pattern, fix your Instance type, and then purchase a Reserved Instance.
  • Reserved Instances are per region - if you buy one in the US-East region and later decide to move your Instances to US-West, the cost reduction will not be applicable to Instances running in the US-West region
Of course, you have the benefits of,
  • Reduced cost - a minimum of 30-40% cost savings which increases if you purchase a 3-year term
  • Guaranteed Capacity - AWS will always provide you the number of Reserved Instances you have purchased (you will not get an error saying "Insufficient Capacity")
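A quick break-even sketch for a Reserved Instance. The $0.32/hr on-demand rate is the m1.large figure quoted above; the upfront fee and discounted hourly rate are illustrative placeholders - plug in the actual AWS price list:

```python
# Back-of-the-envelope yearly cost comparison: on-demand vs reserved.
HOURS_PER_YEAR = 24 * 365  # 8760, assuming the Instance runs continuously

def yearly_cost_on_demand(hourly, hours=HOURS_PER_YEAR):
    return hourly * hours

def yearly_cost_reserved(upfront, discounted_hourly, hours=HOURS_PER_YEAR):
    return upfront + discounted_hourly * hours

od = yearly_cost_on_demand(0.32)          # the post's m1.large on-demand rate
ri = yearly_cost_reserved(780.0, 0.124)   # placeholder RI terms for illustration
savings_pct = (od - ri) / od * 100
print(round(od, 2), round(ri, 2), round(savings_pct, 1))  # 2803.2 1866.24 33.4
```

Re-running the same arithmetic with real prices is how you decide whether a one-year or three-year term pays off for your baseline servers.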
Spot Instances - in this model, you bid against the market price of an Instance, and if your bid is at or above the Spot Market price you get an Instance, paying the market price. The Spot Market price is available through an API that can be queried regularly. You can write code that checks the Spot Market price and places bids specifying the maximum price you are willing to pay. If your bid exceeds the market price, an Instance is provisioned for you. The spot price will be substantially lower than on-demand pricing. For example, at the time of writing this article, the Spot Instance price for an m1.large Linux Instance in US-East was $0.026/hr against $0.32/hr for on-demand. That is over 90% cost reduction on an hourly basis. But the catch is,
  • The Spot Market price keeps changing as other users place their bids and as AWS's excess capacity varies
  • If your maximum bid price falls below the Spot Market price, AWS may terminate your Instance. Abruptly
  • Hence you may lose your data, or your code may terminate unfinished
  • Jobs that you anticipate completing in one hour may take a few more hours to complete
Hence Spot Instances are not suitable for all kinds of workloads. Certain workloads like log processing and encoding can exploit Spot Instances, but doing so requires careful algorithms and a deeper understanding. Here are some of the use cases for Spot Instances.
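The bid-versus-market logic above boils down to a simple check (a sketch; in practice the current market price comes from the Spot price history API rather than a parameter):

```python
# Sketch: does your Instance run at the current Spot Market price?
def instance_runs(my_max_bid, spot_market_price):
    # You pay the market price, not your bid - but you keep the Instance
    # only while your bid stays at or above the market.
    return my_max_bid >= spot_market_price

# At the $0.026/hr market price quoted above, a $0.05 bid runs...
print(instance_runs(0.05, 0.026))  # True
# ...until a demand spike pushes the market above your bid:
print(instance_runs(0.05, 0.08))   # False
```

Any workload running on Spot therefore needs checkpointing, so that an abrupt termination loses at most the work since the last checkpoint.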

Now, with that basic understanding, let's examine the whitepaper a little more carefully, not just from a cost point of view. Web applications can be classified into three types based on the nature of their traffic: Steady Traffic, Periodic Burst, and Always Unpredictable. Here is a summary comparing the benefits of hosting each of them in AWS.
AWS benefits for different type of web applications

Steady Traffic
The website has steady traffic. You are running a couple of servers on-premise and consider moving them to AWS, or you are hosting a new web application on AWS. Here's the cost comparison from the whitepaper.

Source: AWS Whitepaper quoted above

  • You will most likely start by spinning up On-Demand Instances - a couple of them for web servers and a couple for your database (for HA)
  • Over the long run (3 years), if you only use On-Demand Instances, you may end up paying more than hosting on-premise. Do NOT just run On-Demand Instances if your usage is steady
  • If you have been on AWS for a couple of months and are OK with the performance of your setup, you should definitely consider purchasing Reserved Instances. You will end up with a minimum of 40% savings against on-premise infrastructure and about 30% against running On-Demand Instances
  • You will still benefit from spinning up infrastructure on demand. Unlike on-premise, where you need to plan and purchase ahead, here you have the option of provisioning on demand, just in time
  • And in case you grow and your traffic increases, you have the option to add capacity to your fleet and remove it later. You can change server sizes on demand and pay only for what you use. This will go a long way in terms of business continuity, user experience and sales
  • You can always mix and match Reserved and On-Demand Instances and reduce your cost whenever required. Reserved Instances can be purchased at any time
Periodic Burst
In this scenario, you have some constant traffic to your website, but periodically there are spikes. For example, every quarter there may be a sales promotion, or during Thanksgiving or Christmas you will have more traffic to the website. During other months, the traffic will be low. Here's the cost comparison for such a scenario.
Source: AWS Whitepaper quoted above

  • You will spin up On-Demand Instances to start with - a couple of them for web servers and a couple for the database
  • During the burst period, you will need additional capacity to meet the burst in traffic. You need to spin up additional Instances for your web tier and application tier to meet the demand
  • Here is where you will enjoy the benefits of on-demand provisioning. This is something that is not possible with on-premise hosting, where you purchase the required excess capacity well ahead of time and keep running it even when traffic is low. With on-demand provisioning, you provision the extra servers only during the burst period. Once the promotion is over, you can terminate that extra capacity
  • For the capacity that you will always run as the baseline, you can purchase Reserved Instances and reduce the cost up to 75%
  • Even if you do not purchase Reserved Instances, you can run On-Demand Instances and save around 40% against on-premise infrastructure, because for the periodic burst requirement you provision capacity only during the burst period and turn it off later. This is not possible in an on-premise setup, where you would purchase this capacity ahead of time anyway
Always Unpredictable
In this case, you have an application whose traffic you cannot predict at all. For example, a social application that is in an experimental stage and that you expect to go viral. If it goes viral and gains popularity, you will need to expand the infrastructure quickly. If it doesn't, then you do not want to risk heavy cap-ex. Here's the cost comparison for such a scenario.
Source: AWS Whitepaper quoted above

  • You will spin up On-Demand Instances and scale them according to the traffic
  • You will use automation tools such as Auto Scaling to scale the infrastructure on demand and align your infrastructure with the traffic
  • Over a 3-year period, there will be some initial steady growth of the application. As the application goes viral, you will need to add capacity. And beyond its lifetime of, say, 18 months to 2 years, the traffic may start to fall
  • Through monitoring tools such as CloudWatch, you can constantly tweak the infrastructure and arrive at a baseline. You will figure out that during the initial growth and "viral" periods you need a certain baseline of servers. You can go ahead and purchase Reserved Instances for those and mix them with On-Demand Instances when you scale. You will enjoy a cost saving of around 70% against an on-premise setup
  • It is not advisable to plan for full capacity and run at full capacity, or to purchase Reserved Instances for the full capacity. If the application does not do as well as anticipated, you may end up paying AWS more than you actually need to
As you can see, whether you have a steady-state application or are trying out a new idea, AWS proves advantageous from different perspectives for different requirements. Cost, on-demand provisioning, scalability, flexibility and automation tools are things that even a startup can think of and get on board with quickly. One question you need to ask yourself is "Why am I moving into AWS?". Ask this question during the early stages and spend considerable time on the design and architecture of the infrastructure setup. Otherwise, you may end up asking yourself "Why did I come into AWS?".