I am trying to understand how Amazon implements its auto scaling feature. I understand how it is triggered, but I don't know what exactly happens during auto scaling. How does it expand? For instance,
if I set the triggering condition to CPU > 90, then once the VM's CPU usage rises above 90:
I understand that it can load-balance between the VMs, but I cannot find any links or papers that explain how Amazon auto scaling works. It would be great if you could point me to some information on this. Thank you.
Essentially, during setup you register an AMI and a set of EC2 start parameters, known as a launch configuration (instance type, user data, security group, etc.), plus an Auto Scaling group that says which availability zones to launch into. You also set up scaling policies.
When a scaling policy fires (for example, your CPU > 90 condition), an instance is started from the combination of the AMI and the launch configuration. It is registered with an IP address in the AWS environment.
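To make the trigger-then-launch step concrete, here is a minimal sketch in plain Python. It is a toy model, not the AWS API: the class names, the `ami-example` ID, the instance ID format and the `max_size` cap are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class LaunchConfiguration:
    """Hypothetical stand-in for the start parameters registered up front."""
    ami_id: str
    instance_type: str
    security_group: str
    user_data: str


@dataclass
class ScalingPolicy:
    """Hypothetical policy: add `adjustment` instances when CPU exceeds the threshold."""
    cpu_threshold: float
    adjustment: int


@dataclass
class AutoScalingGroup:
    launch_config: LaunchConfiguration
    policy: ScalingPolicy
    max_size: int
    instances: List[str] = field(default_factory=list)

    def launch_instance(self) -> str:
        # A real launch boots the AMI with the launch configuration;
        # here we only hand out a made-up instance ID.
        instance_id = f"i-{len(self.instances):04d}"
        self.instances.append(instance_id)
        return instance_id

    def evaluate(self, average_cpu: float) -> List[str]:
        """Launch new instances if the policy fires, never exceeding max_size."""
        if average_cpu <= self.policy.cpu_threshold:
            return []
        room = self.max_size - len(self.instances)
        to_launch = max(0, min(self.policy.adjustment, room))
        return [self.launch_instance() for _ in range(to_launch)]


config = LaunchConfiguration(
    ami_id="ami-example",          # placeholder, not a real AMI
    instance_type="m1.small",
    security_group="web",
    user_data="#!/bin/bash\necho bootstrapping",
)
group = AutoScalingGroup(
    launch_config=config,
    policy=ScalingPolicy(cpu_threshold=90.0, adjustment=1),
    max_size=3,
)
group.launch_instance()                    # the instance you start with
print(group.evaluate(average_cpu=72.0))    # [] (below the threshold)
print(group.evaluate(average_cpu=95.0))    # ['i-0001'] (policy fired)
```

The point of the sketch is only that the policy decides when and how many, while the launch configuration decides what gets started.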
As part of the initial startup (done by the EC2Config service on Windows AMIs, going from memory here), the newly starting instance can connect to the instance metadata service and run the script stored in "userdata". This script can bootstrap software installation, operating system configuration, settings, anything really that you can do with a script.
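As a rough illustration of that bootstrap step, the sketch below reads user data from the instance metadata endpoint and works out what kind of script it holds. The URL is the standard link-local metadata address, but the plain GET assumes the older token-free access mode (newer setups may require a session token first). The function names and the classification labels are my own, and the rules are simplified: Windows user data is run when wrapped in `<script>` or `<powershell>` tags, and Linux boot agents such as cloud-init run user data that starts with `#!`.

```python
import urllib.request

# Link-local address of the instance metadata service; only reachable
# from inside a running EC2 instance.
USER_DATA_URL = "http://169.254.169.254/latest/user-data"


def fetch_user_data(url: str = USER_DATA_URL, timeout: float = 2.0) -> str:
    """Fetch the raw user data (assumes token-free metadata access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def classify_user_data(user_data: str) -> str:
    """Simplified guess at how a boot agent would treat the user data."""
    if user_data.startswith("#!"):
        return "shell"          # e.g. run by cloud-init on Linux
    lowered = user_data.strip().lower()
    if lowered.startswith("<powershell>"):
        return "powershell"     # run by the Windows boot service
    if lowered.startswith("<script>"):
        return "batch"          # Windows command script
    return "data"               # not executed, just available to read


print(classify_user_data("#!/bin/bash\nyum install -y httpd"))
print(classify_user_data("<powershell>Install-WindowsFeature Web-Server</powershell>"))
```

On a real instance you would call `classify_user_data(fetch_user_data())`; off-instance the fetch simply times out.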
Once it's completed, you've got a newly created instance.
Now, if this process was kicked off by auto scaling behind an Elastic Load Balancer, then once the instance reports "Windows is ready" (check ec2config.log), it is added to the load balancer. Once it is responding to the load balancer's health checks, it will be marked healthy, and the ELB will start routing traffic to it.
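The register, health-check, then route sequence can be sketched like this. Again it is a toy model in plain Python, not the ELB API: the class, the `healthy_threshold` default of 2, the round-robin routing and the rule that a single failed check resets the count are all simplifying assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LoadBalancer:
    """Toy load balancer: routes only to instances that passed enough checks."""
    healthy_threshold: int = 2
    consecutive_passes: Dict[str, int] = field(default_factory=dict)

    def register(self, instance_id: str) -> None:
        # A newly registered instance starts out of service.
        self.consecutive_passes[instance_id] = 0

    def record_health_check(self, instance_id: str, ok: bool) -> None:
        if instance_id not in self.consecutive_passes:
            raise KeyError(f"{instance_id} is not registered")
        previous = self.consecutive_passes[instance_id]
        # Simplification: any failure resets the pass counter to zero.
        self.consecutive_passes[instance_id] = previous + 1 if ok else 0

    def in_service(self) -> List[str]:
        return [
            instance_id
            for instance_id, passes in self.consecutive_passes.items()
            if passes >= self.healthy_threshold
        ]

    def route(self, request_count: int) -> List[str]:
        """Spread requests round-robin over the healthy instances."""
        targets = self.in_service()
        if not targets:
            raise RuntimeError("no healthy instances to route to")
        return [targets[i % len(targets)] for i in range(request_count)]


lb = LoadBalancer()
lb.register("i-old")
lb.record_health_check("i-old", ok=True)
lb.record_health_check("i-old", ok=True)

lb.register("i-new")                      # freshly scaled-out instance
lb.record_health_check("i-new", ok=True)
print(lb.route(3))                        # ['i-old', 'i-old', 'i-old']

lb.record_health_check("i-new", ok=True)  # second pass: now healthy
print(lb.route(4))                        # ['i-old', 'i-new', 'i-old', 'i-new']
```

Until the new instance has passed its checks, all traffic keeps going to the existing one, which is why a slow bootstrap script delays the moment the extra capacity actually helps.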
The gold standard is to have a generic AMI and use your bootstrap script to install all the packages, MSIs, gems or whatever else you need onto the server. What often happens instead is that people build a golden image with everything preinstalled and register that AMI for scaling.
The downside of the latter method is that every release requires a new AMI to be created and the launch configurations to be updated.
Hope that gives you a bit more info.