We're planning to use EC2 instances launched from AMIs that are not "pre-baked", i.e. when they are spun up, they are bare installs of Amazon Linux. Our bootstrap process will pull in the various installs that we need, e.g. Python and Tomcat. We'll have a minimum of 3 instances and a maximum of 8.
Given these requirements, would Puppet/Chef be more useful than Amazon CloudFormation (with CloudInit)?
As far as I can see, if we used Puppet we'd have declarative configuration, which is easier to audit than a script. Also, the user data that CloudInit runs has a 16 KB size limit, which we may or may not run into.
Has anyone moved from CloudInit to Puppet or Chef for a specific reason that they can provide here in answer to my question?
Is there an advantage over CloudInit? Yes, absolutely, many of them!
Sure, you can write top-to-bottom, run-once CloudInit scripts to provision a server. But what happens when you need to change a configuration file, add a user, update a package, or install a new package? You will end up logging into servers or writing more scripts to do so, and inevitably your servers will drift into inconsistent states.
CloudInit is not configuration management. If you opt to begin using configuration management software, use CloudInit for just one task: to bootstrap the Puppet/Chef/other agent.
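To make the "just one task" idea concrete, here is a minimal sketch in Python that generates such a bootstrap script and checks it against the 16 KB user data limit mentioned in the question. The package name, the server hostname, and the assumption that `yum` can find the agent package and that `puppet` ends up on the PATH are all illustrative, not something your environment is guaranteed to match.

```python
# EC2 caps user data at 16 KB, measured before base64 encoding.
MAX_USER_DATA_BYTES = 16 * 1024


def build_user_data(agent_package: str, server: str) -> str:
    """Return a shell script that only installs the agent and runs it once."""
    script = "\n".join([
        "#!/bin/bash",
        "set -euo pipefail",
        # Assumption: the package comes from a repo the AMI already has configured.
        f"yum install -y {agent_package}",
        # Assumption: `puppet` is on the PATH after the install.
        # Everything else is left to the manifests on the server.
        f"puppet agent --server {server} --onetime --no-daemonize",
        "",
    ])
    size = len(script.encode("utf-8"))
    if size > MAX_USER_DATA_BYTES:
        raise ValueError(f"user data is {size} bytes, limit is {MAX_USER_DATA_BYTES}")
    return script


if __name__ == "__main__":
    # Hypothetical package name and hostname, for illustration only.
    print(build_user_data("puppet-agent", "puppet.example.internal"))
```

Because the script does nothing but hand over to the agent, it stays a few lines long no matter how much your stack grows, so the size limit stops being a concern.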
Puppet doesn't just help you automate installing packages, setting up SSH keys, or tuning your Tomcat heap. It enforces the state of things. When a developer is troubleshooting a Java app at 3am and changes your Tomcat config, Puppet will change it back on its next run. You can rapidly change the version of Python for all nodes or for groups of nodes, and if someone installs a different version, Puppet will change it back.
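The difference from a run-once script can be sketched in a few lines of Python. This is not how Puppet is implemented, only an illustration of the idea of converging on a declared state: the function below (a hypothetical helper, as is the file name in the demo) does nothing when the file already matches, and rewrites it when it is missing or has drifted.

```python
import os
import tempfile


def ensure_file(path: str, desired: str) -> bool:
    """Make the file at `path` contain `desired`. Return True if it was changed."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            if fh.read() == desired:
                return False  # already in the desired state, nothing to do
    except FileNotFoundError:
        pass  # a missing file counts as drift
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(desired)
    return True


if __name__ == "__main__":
    # Hypothetical config file and contents, for illustration only.
    path = os.path.join(tempfile.mkdtemp(), "setenv.sh")
    desired = 'JAVA_OPTS="-Xmx2g"\n'
    print(ensure_file(path, desired))  # first run creates the file
    print(ensure_file(path, desired))  # second run finds nothing to change
    with open(path, "w", encoding="utf-8") as fh:
        fh.write('JAVA_OPTS="-Xmx8g"\n')  # someone edits it by hand
    print(ensure_file(path, desired))  # next run puts it back
```

A configuration management agent runs this kind of check on a schedule for every resource you declare, which is why a manual change does not survive.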
When your application stack changes and you start using, say, RabbitMQ, Jetty, or a new RDBMS, you can easily test and deploy the changes across tens or thousands of servers.
There are many other reasons to use configuration management software, such as back-end reporting, auditing, and security compliance.