Both Marathon and Aurora are built on Mesos and supposedly are engineered for running long running services. My questions are:
Thanks!
Disclaimer: I am the VP of Apache Aurora, and have been the tech lead of the Aurora team at Twitter for ~5 years. My likely-biased opinions are my own and do not necessarily represent those of Twitter or the ASF.
Do these frameworks run anything that runs on Linux? For Marathon they state that it can run anything that "is executable in a shell" but this is sort of vague :)
Essentially, yes. Ultimately these systems are sophisticated machinery to execute shell code somewhere in a cluster :-)
What are their differences? I have struggled in finding any good explanations regarding their key differences
Aurora and Marathon do indeed offer similar feature sets, both being classified as "service schedulers". In other words, you hand us instructions for how to run your application servers, and we do our best to keep them up.
I'll offer some differences in broad strokes. When it comes to shortcomings mentioned in each, I think it's safe to say that the communities are aware and intend to fix them.
Ease of use
Aurora is not easy to install. It will likely feel like you are trailblazing while setting it up. It exposes a thrift API, which means you'll need a thrift client to interact with it programmatically (a REST-like API is coming, but is vaporware at the moment), or use our command line client. Aurora has a DSL for configuration which can be daunting, but allows you to easily share templates and common patterns as you use the system more.
Marathon, on the other hand, helps you to run 'Hello World' as quickly as possible. It has great docs to do this in many environments and there's little overhead to get going. It has a REST API, making it easier to adapt to custom tools. It uses JSON for configuration, which is easy to start with but more prone to cargo culting.
Targeted use cases
Aurora has always been designed to handle a large engineering organization. The clusters at Twitter have tens of thousands of machines and hundreds of engineers using them. It is critical to Twitter's business. As a result, we take our requirements of scale, stability, and security very seriously. We make sure to only condone features that we believe are trustworthy at scale in production (for example, we have our Docker support labeled as beta because of known issues with Docker itself and the Mesos-Docker integration). We also have features like preemption that make our clusters suitable for mixing business-critical services with prototypes and experiments.
I can't make any claim for or against Marathon's scalability. On the feature front, Marathon has build out features quickly, but this can feel bleeding edge in practice (Docker support is a good example). This is not always due to Marathon itself, but also layers down the stack. Marathon does not provide preemption.
Ownership
To some, ownership and governance of a project is important. It feel that in practice it does not define the openness of a project, but for some people/companies the legal fine print can be a deal-breaker.
To some, this is beneficial, to others is is not. It means that you can pay for support and features. It also means that there is something to be sold, and the project direction is ultimately decided by Mesosphere's interests.
This means it is subject to the governance model of the ASF, driven by the community. Aurora does not have paying customers, and there is not currently a software shop that you can pay for development.
tl;dr If you are just getting your feet wet with running services on Mesos, I would suggest Marathon as your first port of call. It will be easier for you to get running and poke around the ecosystem. If you are forming the 'private cloud strategy' for a company, I suggest seriously considering Aurora, as it is proven and specifically designed for that.