Monday, May 21, 2012

The thin line between IaaS and PaaS

Over the years, Cloud Computing has been divided into three service models: Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS). The level of abstraction decreases in that order. SaaS simply offers software that you subscribe to and start consuming; it is aimed at end consumers. PaaS is aimed at developers: it abstracts the underlying infrastructure, so you simply write code for the platform and the rest is taken care of by the platform provider. In Facebook terms, you stay focused and keep shipping your code. Of course, platforms are tied to programming languages, since a provider cannot offer the same level of abstraction for every language. IaaS is further down the stack: the provider abstracts the underlying hardware but still exposes the infrastructure components (such as compute and storage) as fundamental building blocks, which you then assemble as per your needs. You need not worry about maintaining the infrastructure, but you still need to manage it actively.

The two main players in this space are Amazon Web Services (AWS), providing IaaS, and Microsoft Azure, providing .NET PaaS. OpenStack is also an emerging IaaS player but has a long way to go to catch up with AWS. You also have Google App Engine providing Python and Java PaaS. Customers choose which way to go depending upon their needs, and when it comes to Cloud Computing, AWS and Azure are the options that most customers evaluate. Though AWS and Azure are fundamentally different in their offerings, the line between IaaS and PaaS has, of late, been getting thinner. If you look at AWS, it primarily offers compute and storage as a service. But some of its other services are what make this space very interesting.

Amazon RDS - this is the relational database service from AWS. You can provision MySQL, SQL Server and Oracle RDS Instances and start using them on a pay-as-you-go model. The entire database service is managed by Amazon. You don't need to worry about patching the OS or the database; that is taken care of by Amazon. It also comes with high availability and automated backups. And to use it, you don't need to change a single line of your existing code base. Just provision the database, import your data and change the database connection string in your app.
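As a rough illustration of that last step, the sketch below swaps only the host in a URL-style connection string and leaves the credentials, port and database name alone. The URL format, the function name point_at_new_host and both host names are assumptions made up for this example; a real RDS endpoint is whatever the AWS Management Console shows for your Instance.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def point_at_new_host(db_url: str, new_host: str, new_port: Optional[int] = None) -> str:
    """Return db_url with only the host (and optionally the port) replaced."""
    parts = urlsplit(db_url)
    port = new_port if new_port is not None else parts.port

    # Keep the existing credentials exactly as they appear in the URL.
    userinfo = ""
    if parts.username:
        userinfo = parts.username
        if parts.password:
            userinfo += ":" + parts.password
        userinfo += "@"

    netloc = userinfo + new_host + (f":{port}" if port else "")
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Both host names below are hypothetical placeholders.
old_url = "mysql://app:secret@db.internal.example:3306/shop"
new_url = point_at_new_host(old_url, "mydb.example.rds.amazonaws.com")
print(new_url)
```

Keeping the connection string in configuration rather than in code is what makes this a one-line change for the application.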

Amazon DynamoDB/SimpleDB - these are the non-relational, or NoSQL, database services from AWS. They are primarily key-value stores offering read and write scalability. Using them requires API-level integration, which means writing code specific to these services.

AWS Elastic Beanstalk - this is a PaaS-like offering from AWS. You write your code and generate the final deployment artifact. You then use Elastic Beanstalk to build the infrastructure that is required to host your application. You can do so from your IDE, and with the toolkit (AWS Toolkit for Visual Studio or AWS Toolkit for Eclipse) you can manage your application deployment. This is very similar to a PaaS offering, but with Elastic Beanstalk you still have control over the underlying infrastructure. You can modify, remove or add the elements that power the infrastructure and still enjoy the benefits of the abstraction that Elastic Beanstalk provides.

Amazon ElastiCache - this is a caching service available from AWS. It is a fully managed cache service that is protocol-compliant with Memcached. All you do is provision a cache cluster and point the Memcached client in your application at the ElastiCache node endpoints. The rest is taken care of by AWS.

Flip over to Azure

Now all of these services, in addition to many others, are available in Azure as well. But your programming language of choice is primarily .NET. I haven't worked extensively with Azure, but my understanding is that you create an "Azure Project" in Visual Studio or convert an existing .NET project into an Azure Project. You then have all the components, such as the database (SQL Server), caching service and ESB, within the project in Visual Studio itself. You build your application on top of these components, and when you are ready to go live you deploy it from Visual Studio itself. The rest is completely managed by Microsoft: provisioning the required resources, managing them and maintaining them.

Some of the AWS services that we saw above reflect the same idea, Elastic Beanstalk in particular. Beanstalk is very similar to Azure. But on a closer look, it is actually a wrapper around the fundamental blocks that AWS offers. Since AWS started out by offering Infrastructure components, it is fairly simple for them to build Beanstalk. They have come up with support for Java, .NET and PHP by simply wrapping the necessary components. For example, Beanstalk for Java would be a wrapper over EC2 running Tomcat plus MySQL on RDS, both of which are available to you as normal services as well. Similarly, Beanstalk for .NET means Windows EC2 running IIS plus SQL Server on RDS, again available as normal services from AWS. In the same way, AWS can go ahead and build support for other languages and frameworks such as Ruby on Rails, Python, etc.

The other services, like DynamoDB, RDS, ElastiCache and Simple Email Service (SES), can simply be called services. But they already abstract more than what you would expect from an infrastructure component. RDS is not a pure infrastructure building block by definition. It is one level up, where an entire database is offered as a service, managed by AWS with automated backups and restore points.

AWS continues to offer the infrastructure components that it originally started off with. There will probably be more and more infrastructure components coming out of AWS, and you can continue to build your infrastructure the way you wish to. It also offers PaaS-like solutions such as Elastic Beanstalk, and it might add support for more programming languages in the coming days. In between these, you have services like RDS and ElastiCache, which are neither raw infrastructure services nor PaaS-like services. They sit somewhere in between: certain overheads are taken away from you, yet you still have some element of control. On the other hand, Azure, in addition to .NET, now offers support for Java, PHP and node.js. It might continue to evolve as a pure-play PaaS solution while offering support for more languages in the coming days.

Both AWS and Microsoft see a change in the way their services are consumed. What they started off with is not exactly what their customers really wanted, and the line between their two different models is slowly getting thinner. Both of them might evolve into hybrid solution providers while continuing to be seen as an IaaS and a PaaS provider respectively. Again, it all depends on what the customers need at the end of the day and where the majority of them are willing to place their bets. Whatever form the offerings take, Cloud Computing is here to stay.

Wednesday, May 9, 2012

Amazon RDS for SQL Server

Amazon Relational Database Service (RDS) now supports Microsoft SQL Server. With this, RDS supports three major databases, viz., MySQL, Oracle and MS SQL Server. RDS for SQL Server brings the same automated backups available with MySQL and Oracle RDS (Multi-AZ is not yet supported, as covered later in this post), and it is entirely managed by AWS. Developers can now set up a managed and (manually) scalable SQL Server with a few clicks through the AWS Management Console. The entire database management overhead is taken away, since AWS does that work behind the scenes. One need not worry about disks failing, installing patches, etc.

Of course, everyone saw this coming. Most enterprise deployments have SQL Server as their relational database, and it is a natural move for AWS to tap into its enterprise customers (who have started utilizing the public cloud). One would appreciate this more after a short trip into the past.

July 2011 - this is when Microsoft extended Microsoft License Mobility, or Bring Your Own License (BYOL), to AWS. Prior to this, AWS did not have an official partnership with Microsoft for bringing licensed products into AWS. Different Microsoft products had different licensing models, and even a single product could come with several licensing models. Not all of them worked with AWS. For example, some licenses were CPU based and some were tied to the host name. Customers that I had worked with had already purchased multi-year licenses and wanted to use them in the Cloud. Of course, their perception was that AWS offers Infrastructure, so they could simply rent a server and use their own licenses. As a consultant, I had to find out what type of licenses the customers had and validate whether they worked with AWS. One can do such a validation only by actually trying it out. And it doesn't stop with installing once. One also has to verify that the software still works when a server image is built from the installation and relaunched (of course, you need to be prepared for your EC2 Instance going down).

Did you know that there is a separate way to bring your Microsoft BizSpark licenses? Licenses. Uff.

I personally had such problems with SQL Server setup. Many of the enterprise customers I had worked with invariably had SQL Server as their relational database. With the many licensing models and the different editions available, it was a nightmare to set up SQL Server on an EC2 Instance. Most likely, the database Instance would not start on the first attempt. If you happened to restart the EC2 Instance after installation, SQL Server might report an error about an invalid host name. That's when I would realize that I had forgotten to configure the EC2Config service to retain the host name on restart.

All these frustrations went away when AWS introduced Microsoft Windows Running SQL Server, with support for the SQL Server Express, Web and Standard editions (2005, 2008, 2012). With this option, one can directly launch a Windows EC2 Instance with SQL Server pre-installed and configured. This took away all the worries associated with installing and configuring SQL Server. It also moved one to a pay-as-you-go model, since the licensing is tied to the per-hour Instance cost. Of course, someone who had purchased a multi-year license still needed to come through BYOL. Though the installation part is taken care of by AWS, this option still did not solve the following problems:

  • High Availability - we cannot set up SQL Server Clustering on EC2. Only Mirroring, Replication and Log Shipping are possible
  • Backup - we have to set up backups on the SQL Server Instance and write additional scripts to move the backups to Amazon S3
  • Patches/Updates - completely managed by us

With RDS for SQL Server, AWS addresses the above concerns. One gets a SQL Server database Instance with automated backups that is completely managed by AWS. We can manually scale up the database (one click) if we need higher capacity, without having to re-install and re-configure. The database can also be restored from any of the automated backups with a single click (through the AWS Management Console).

What's not available?
Currently RDS for SQL Server does not provide the following:
  • Multi-AZ - an option that is available for both RDS for MySQL and RDS for Oracle. A standby database Instance runs in another Availability Zone, and in case of any failure at the primary, RDS initiates an automatic failover. There is no manual intervention, though the application tier will see about 5 minutes of downtime. If Amazon can offer the same for SQL Server as well, the offering becomes a killer
  • Read Replicas - an option available only for MySQL RDS and not for Oracle and SQL Server RDS. Most web applications are read intensive. This feature can be used to scale out with multiple read replicas on demand and increase the database read throughput
  • VPC - only MySQL RDS can be provisioned within a VPC. Oracle and SQL Server RDS Instances cannot be set up in a VPC. Something that enterprise customers would love to have

I am sure AWS is already working on bringing out all these features, and it is only a matter of time before they become available for SQL Server as well. Considering the pace at which AWS is adding and enhancing services, that day isn't very far away.

Sunday, May 6, 2012

Amazon Web Services Public IPs

Last week, Amazon Web Services (AWS) expanded its list of public IP ranges in all of its US datacenters (regions in AWS terms). Looking at the available public IP (Elastic IP in AWS terms) ranges, one can make a simple calculation to arrive at the total number of public IP addresses that AWS has. For each range, subtract the number of network bits from 32 and raise 2 to that power. Then subtract 2 from the result for the network and broadcast addresses. For example, a "/19" network has 2 ^ (32-19) - 2 = 8190 unique IP addresses. In other words, 8190 hosts can be set up in that network. Here is the expanded list of all the public IP ranges that AWS has:

EC2 IP Addresses

If you sum it up, AWS has 1,982,384 public IP addresses across all of its datacenters. That's close to 2 million public IP addresses. Apart from being available to customers as Elastic IP addresses, these addresses could be used by AWS itself for various services such as Elastic Load Balancer and Route 53. Hence the number of servers in the AWS datacenters could be far less than this. There is an interesting article that tries to estimate the total number of servers through a different approach.
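The per-range calculation and the final sum can be sketched in a few lines of Python using the standard ipaddress module. The ranges below are private and documentation placeholders chosen only for illustration, not actual AWS ranges, and the function names are made up for this example. The sketch follows the same rule as the post (subtract 2 per range), so it assumes every range is a /30 or larger.

```python
import ipaddress
from typing import Iterable


def usable_hosts(cidr: str) -> int:
    """Hosts in a range: 2 ^ (32 - prefix) minus the network and broadcast addresses."""
    network = ipaddress.ip_network(cidr)
    return network.num_addresses - 2


def total_usable_hosts(cidrs: Iterable[str]) -> int:
    """Sum the usable host count over a list of CIDR ranges."""
    return sum(usable_hosts(cidr) for cidr in cidrs)


# Placeholder ranges (NOT real AWS ranges), used only to exercise the functions.
example_ranges = ["10.0.0.0/19", "192.0.2.0/24", "198.51.100.0/24"]

print(usable_hosts("10.0.0.0/19"))         # the "/19" example from the post: 8190
print(total_usable_hosts(example_ranges))  # 8190 + 254 + 254 = 8698
```

Feeding the published AWS ranges into total_usable_hosts in place of example_ranges would reproduce the kind of total quoted above.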

Thursday, May 3, 2012

Hello World

This is my attempt at writing on technology. I run another personal blog where I primarily rant about movies, music and life in general. About that part of life where work doesn't interfere. The last four years of my career have been an immense journey for me. Probably one of the best things that happened to me was quitting a mainstream job and joining a startup. I have been working in the area of Cloud Computing for the last four years, specifically on Amazon Web Services. The focus for the initial few years was on building a product on top of Cloud Computing. Recently, I have been heavily involved in Architecture Consulting and Solution Design in the Cloud, solving large scale problems for customers in the media, e-commerce, airlines and gaming industries.

Working on such problems exposes me to a variety of problem sets that require tailor-made solutions. And, of course, on the Open Source platform. In this blog, I will try to write about such experiences, any insights that I can share, and general updates from the technology world along with my perspective. Certain areas that I am planning to focus on include

  • Cloud Computing
  • Amazon Web Services
  • Infrastructure Architecture
  • Scalability & High Availability
  • Open Source Technology

I am really excited to start something like this and I hope that I continue to write often and keep this blog alive and kicking.