How does Amazon EC2 Auto Scaling work?

sethu · Sep 25, 2011

I am trying to understand how Amazon implements the auto scaling feature. I can understand how it is triggered, but I don't know what exactly happens during the auto scaling. How does it scale out? For instance:

If I set the triggering condition as CPU > 90%, then once the VM's CPU usage rises above 90%:

  1. Does it have a template image which will be copied to the new machine and started?
  2. How long will it take to start servicing new requests?
  3. Will the old VM have any downtime?

I understand that it can load balance between the VMs, but I cannot find any links/papers that explain how Amazon Auto Scaling works. It would be great if you could provide some information on this. Thank you.

Answer

Pete - MSFT · Dec 15, 2012

Essentially, during setup you register an AMI and a set of EC2 start parameters as a launch configuration (instance size, userdata, security group, region, availability zone, etc.). You also set up scaling policies.
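Assuming the AWS CLI, that setup might look roughly like this sketch (the AMI ID, key name, security group, and availability zone are placeholder values):

```shell
# Register the AMI plus EC2 start parameters as a launch configuration.
aws autoscaling create-launch-configuration \
    --launch-configuration-name web-lc \
    --image-id ami-12345678 \
    --instance-type m1.small \
    --key-name my-key \
    --security-groups web-sg \
    --user-data file://bootstrap.sh

# Create the Auto Scaling group that will launch instances from it.
aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name web-asg \
    --launch-configuration-name web-lc \
    --min-size 1 --max-size 4 \
    --availability-zones us-east-1a
```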

  1. Your scaling trigger fires.
  2. Policies are examined to determine which launch configuration applies.
  3. The EC2 RunInstances API is called with the registered AMI and the launch configuration parameters.
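A trigger like the CPU > 90% condition from the question can be sketched as a scaling policy plus a CloudWatch alarm (group name, policy name, and the policy ARN placeholder are illustrative):

```shell
# A scale-out policy that adds one instance; the command returns the policy ARN.
aws autoscaling put-scaling-policy \
    --auto-scaling-group-name web-asg \
    --policy-name scale-out \
    --scaling-adjustment 1 \
    --adjustment-type ChangeInCapacity

# A CloudWatch alarm that fires that policy when average CPU exceeds 90%.
aws cloudwatch put-metric-alarm \
    --alarm-name cpu-high \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions Name=AutoScalingGroupName,Value=web-asg \
    --statistic Average \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 90 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions <policy-arn>
```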

At this point, an instance is started that is a combination of the AMI and the launch configuration. It is assigned an IP address and registered in the AWS environment.

As part of the initial startup (done by ec2config or ec2run - going from memory here), the newly starting instance can connect to instance metadata and run the script stored in "userdata". This script can bootstrap software installation, operating system configuration, settings - anything, really, that you can do with a script.
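For a Linux AMI, a hypothetical userdata bootstrap might look like this (package names and the S3 bucket are placeholders; a Windows AMI would use a PowerShell equivalent run by ec2config):

```shell
#!/bin/bash
# Runs once at first boot. The script itself is served from instance
# metadata, i.e. curl http://169.254.169.254/latest/user-data
yum install -y httpd                       # install the web server
aws s3 cp s3://my-bucket/app.tar.gz /tmp/  # fetch the application build
tar -xzf /tmp/app.tar.gz -C /var/www/html  # deploy it
service httpd start                        # start serving
```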

Once it's completed, you've got a newly created instance.

Now - if this process was kicked off by Auto Scaling and Elastic Load Balancing, then at the point the instance reports "Windows is ready" (check ec2config.log), the load balancer will add the instance to itself. Once the instance is responding to health checks, it will be marked healthy, and the ELB will start routing traffic to it.
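The health check that gates "healthy", and the link between the group and the load balancer, can be sketched like so (load balancer name and health-check target are assumptions):

```shell
# Define the check an instance must pass before it receives traffic.
aws elb configure-health-check \
    --load-balancer-name web-elb \
    --health-check Target=HTTP:80/health,Interval=30,Timeout=5,UnhealthyThreshold=2,HealthyThreshold=2

# Attach the load balancer to the Auto Scaling group so newly launched
# instances are registered with it automatically.
aws autoscaling attach-load-balancers \
    --auto-scaling-group-name web-asg \
    --load-balancer-names web-elb
```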

The gold standard is to have a generic AMI and use your bootstrap script to install all the packages / MSIs / gems or whatever you need onto the server. But what often happens is that people build a golden image and register that AMI for scaling.

The downside to the latter method is that every release requires a new AMI to be created and the launch configurations to be replaced.
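Because launch configurations are immutable, that release workflow is roughly: bake a new AMI, create a fresh launch configuration, and point the group at it (the v2 names and AMI ID are placeholders):

```shell
# A new release means a new launch configuration referencing the new AMI.
aws autoscaling create-launch-configuration \
    --launch-configuration-name web-lc-v2 \
    --image-id ami-87654321 \
    --instance-type m1.small

# Point the group at the new launch configuration; instances launched
# from here on use the new AMI, but existing instances are not replaced
# until they are terminated and relaunched.
aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name web-asg \
    --launch-configuration-name web-lc-v2
```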

Hope that gives you a bit more info.