Pages

Monday, August 27, 2012

AWS updates that makes architects happy

AWS keeps adding a lot of features to its existing services and also keeps launching new services. With the pace of innovation from AWS and the rate at which new services and updates are released, one may lose track of them pretty easily. As architects it is essential to know all of these all the time so that one proposes the right solution for a given requirement. Here are some of the features and services that has made the lives of architects easy and can help in simpler infrastructure design addressing specific needs. There is no order to this list - I am just listing them down. And here's a summary before we go detail in to each of them


High I/O EC2 Instance Type - This is a new family of Instance types from AWS in addition to the existing High Memory and High CPU Instance types. The first one in this family is hi1.4xlarge which comes with 60GB of memory and 8 virtual cores. The best part of this Instance type is the whopping 2 TB of SSB attached to them. These SSDs are the ephemeral disks that come as part of the Instance and are of course non-persistent. AWS promises around 120K random read IOPS and between 10K and 85K write IOPS. This Instance type is suitable for I/O intensive applications and running NoSQL databases. Here are couple of pointers about this Instance type

  • SSD disks are ephemeral - you will lose data on Instance termination (of course if you stop and start the Instance)
  • Data will be persistent between reboot like any other EC2 Instance
  • You will see two 1TB volumes on the Instance. Can be striped to double the throughput
  • If you are launching a Windows Instance, choose the "Microsoft Windows 2008 R2 64-bit for Cluster Instances" AMI
  • Currently this Instance type is available only in US-East and EU-West regions
  • And its priced at $3.10 an hour :)
I am sure AWS will be releasing lower Instance types (such as High I/O large) since not everyone will be looking to use an 4xlarge Instance. There are already some benchmarks that you can refer here and specifically netflix has performed a detailed benchmark of this Instance type.

Provisioned IOPS for EBS - EBS is great for storing persistent data and the ability to attach and detach it to/from any Instance. But from day zero the performance and throughput has been an issue with EBS. The IO throughput that one gets out of EBS is very poor and all the more it is not consistent. If the network load during that time is not heavy then one might see large throughput from EBS. If not the throughput will hover around 100 IOPS. Fast forward to now and with provisioned IOPS we can set the IOPS that we expect from EBS and AWS will guarantee the IOPS at all times. This is great news for hosting databases on EC2 (if you are not using RDS). Here are couple of pointers on this:
  • Currently the max IOPS that can be set is 1000. AWS will soon increase this limit
  • We can of course drive more throughput by attaching multiple EBS volumes and striping them
  • Costs little extra than a normal EBS Volume - 0.125/GB against 0.10/GB. Of course the benefits will be far more than the cost difference. Guess everyone will eventually move to provisioned IOPS. Nice way to increase AWS revenue :)
I am assuming that provisioned IOPS will be extended to RDS as well in near future. Just like DynamoDB where one can specify the required throughput, AWS should let the user specify the throughput required from RDS. RDS already uses EBS behind the scenes for storing the data and hence this should definitely be in the pipeline. At least I wish so.

EBS-Optimized EC2 Instances - One thing with EBS that everyone understands is that it is network attached. Unlike local ephemeral storage which is directly attached to the EC2 Instance, there is a pipe which transfers all the data that will be written to EBS. So, even if we use provisioned IOPS to increase the throughput from an EBS, the network throughput can be significantly low that we may not utilize the increased EBS throughput; especially when the network is shared between many EC2 Instances. AWS has introduced EBS-Optimized EC2 Instances which have dedicated network throughput to the EBS that are attached to it. An EBS-Optimized m1.large Instance will have about 500 Mbits/s of guaranteed network throughput to EBS.
  • Currently only m1.large, m1.xlarge and m2.4xlarge are available as EBS-Optimized Instances
  • The guaranteed throughput is 500 Mbits/s for m1.large and 1000 Mbits/s for m1.xlarge and m2.4xlarge
  • This should not be confused with the regular network throughput of the EC2 Instance - for example talking to another EC2 Instance. This guaranteed throughput is only for the EBS related traffic. All other traffic is through the other network pipe given to the EC2 Instance (which is not guaranteed and differs by Instance)
  • Comes with additional hourly cost - Optimized m1.large will cost $0.345/hr against a normal on demand m1.large costing $0.320/hr
  • Always use EBS-Optimized Instance when using Provisioned IOPS
Elastic Network Interface (ENI) - This feature is available for Instances launched within Virtual Private Cloud (VPC). When you launch Instances within VPC you normally specify the subnet to launch it in and also specify the private IP address of that EC2 Instance. If the Instance is in a public subnet, you will attach and Elastic IP so that you can access the Instance over the internet. With ENI, you no longer specify a subnet while launching the Instance. You create an ENI (which is attached to a subnet) and attach the ENI to the Instance. So what are the benefits of using ENI?
  • You can attach multiple ENI (belonging to different subnets) to an EC2 Instance
  • One ENI can be configured to have a public IP (Elastic IP) and private IP and the other ENI can be configured to have just a private IP
  • With such a setup, you can allow traffic on port 80 through the first ENI and SSH traffic (port 22) through the other ENI
  • This way you can further secure your VPN architecture by allowing SSH traffic only from your corporate datacenter

Image Courtesy - AWS
Multiple IP Addresses per Instance - with this feature release, AWS addressed the long waited requested for hosting multiple SSL websites on a given Instance. This feature is an extension of  the ENI feature, where each ENI is now allowed to have multiple public and private IPs. We can create an ENI and create multiple Elastic IPs and attach them to the ENI. The ENI will be then attached to the EC2 Instance (during launch or hot attach during running) and the Instance gets multiple IP addresses.
  • The multiple public IP address and private IP addresses are attached to the ENI. And to the Instance when we attach the ENI to the Instance. The difference is that, if we terminate an Instance and relaunch a new one with the same ENI, all the mappings will remain and the new Instance will also get all the public IPs and their private IP mappings
  • Available only within VPC currently
  • Additional cost for each of the additional public IPs that we attach. No cost for the additional private IPs attached
ELB Security Group - When you use an Elastic Load Balancer (ELB), you will obviously place EC2 Instances behind it to accept incoming traffic. For quite a while, on the EC2 Instances Security Group, you will have to open the incoming port (to which ELB will be sending traffic) to all IPs (0.0.0.0/0). This was because of the fact that ELB will scale itself and AWS will not know at a given point of time how many ELB Instances will be running. But now, ELB's come with their own security groups. So on your EC2 Instances (that sit behind the ELB) Security Group, you will have to open up the port to only ELB's security group. This way you secure your EC2 Instances to accept traffic only from ELB.

Internal Load Balancer - You can launch ELB within VPC as an internal load balancer. By this fashion, the ELB's do not have a public IP addresses and will only have private IP addresses. Such a load balancer can be used for internal load balancing requirements between different tiers in the architecture. I wrote a separate article explaining how it can be used.

Custom Security Group of ELB - If you launch ELBs within VPC (either internal or public facing) you have an option of specifying a custom Security Group of your own. This way custom rules can be configured on the ELB's Security Group instead of using the ELB's default Security Group (which cannot be edited).
ELB Custom Security Group

S3 Object Expiration
This feature is great for storing logs and scheduling them to be deleted after a definite period of time. When you have multiple web or application servers you will definitely consider pushing the logs generated on them to S3 for centralized storage. Even if you are running a minimal setup, you certainly do not want run out of space on your Instance. But you do not want to store the logs perpetually - one month old logs will be good enough for most of the applications. Instead of writing custom scripts that periodically deletes the logs stored in S3, use this feature so that Amazon will automatically delete them. Saves some cost.

Route 53 Alias record

Route53 is a scalable DNS service from AWS. With pay-as-you-go pricing, there are whole lot of benefits (in terms of flexibility) with Route53. It is definitely not sophisticated like other DNS service providers in terms of feature set. Be it latency based routing or weighted round robin, the service is updated with features. One feature that will be of interest is "Alias Record". Let's say you have a website called "example.com". And you are hosting the infrastructure in AWS. If your load balancer layer uses AWS Elastic Load Balancing, then you will be provided with a CNAME for the ELB. ELB internally may employ "n" number of servers to scale itself and hence it may have more than one IP. Hence AWS provides the CNAME for an ELB which may resolve to one or more IPs at any time. With that background,


  • You want to create a DNS record for "www.example.com"
  • You also need to create a DNS record for "example.com"; the naked domain or top level domain. Most of the users will use only the naked domain
By design, DNS does not support adding a CNAME entry for the naked domain - simply put you cannot point "http://example.com" to an ELB but you can point "http://www.example.com" to an ELB CNAME. The naked domain can be pointed to only an A-record. Having only A-records for the naked domain will introduce complexity in HA and Scalability for the load balancing layer. And if your load balancing layer employs ELB, there is no way you will know the IPs before hand. One way to address this constraint is to put a separate Apache server which can do the translation. But this will be a serious bottleneck in the architecture since most of the users will use only the naked domain and soon the infrastructure will crumble even though we have build HA and Scalability in all other layers. With the "Alias" feature

  • First create a Route53 record set for "http://www.example.com" and point it to the ELB CNAME
  • Create another record set for "example.com" and set the type as "Alias". Point the record value to the record set created above. Or directly point it to the ELB CNAME
  • With the "Alias" option you will be able to point the record value to either an ELB CNAME or another record set that is available in Route53.
This way, you take care of both naked domain and "www" with both pointing to the load balancing layer in the AWS setup.


Of course there are a whole lot of features and updates that AWS keeps pushing across its services. The above are some of the features that helps architects while designing the infrastructure architecture. These have personally helped us build better architectures. Prior to these features, we always had to look for alternatives such has tuning, custom solutions or sacrifice on certain parameters to achieve others. I am sure there are many more such updates that we can uncover and will be coming from AWS in future. That will be for another post here.

No comments: