I am trying to wrap my head around Apache Mesos and need clarification on a few items.
My understanding of Mesos is that it is an executable that gets installed on every physical/VM server ("node") in a cluster, and then provides a Java API (somehow) that treats each individual node as a collective pool of computing resources (CPU/RAM/etc.). Hence, to programs coding against the Java API, they only see 1 single set of resources, and don't have to worry about how/where the code is deployed.
So for one, I could be fundamentally wrong in my understanding here (in which case, please correct me!). But if I'm on target, then how does the Java API (provided by Mesos) allow Java clients to tap into these resources?!? Can someone give a concrete example of Mesos in action?
Take a look at my awful drawing below. If I understand the Mesos architecture correctly, we have a cluster of 3 physical servers (phys01
, phys02
and phys03
). Each of these physicals is running an Ubuntu host (or whatever). Through a hypervisor, say, Xen, we can run 1+ VMs.
I am interested in Docker & CoreOS, so I'll use those in this example, but I'm guessing the same could apply to other non-container setups.
So on each VM we have CoreOS. Running on each CoreOS instance is a Mesos executable/server. All Mesos nodes in a cluster see everything underneath them as a single pool of resources, and artifacts can be arbitrarily deployed to the Mesos cluster and Mesos will figure out which CoreOS instance to actually deploy them to.
Running on top of Mesos is a "Mesos framework" such as Marathon or Kubernetes. Running inside Kubernetes are various Docker containers (C1
- C4
).
Is this understanding of Mesos more or less correct?
Your summary is almost right but it does not reflect the essence of what mesos represents. The vision of mesosphere, the Company behind the project, is to create a "Datacenter Operating System" and the mesos is the kernel of it in analogy to the kernel of a normal OS. The API is not limited to Java, you can use C, C++, Java/Scala, or Python. If you have set-up your mesos cluster, as you describe in your question and want to use your resources, you usually do this through a framework instead of running your workload directly on it. This doesn't mean that this is complicated here is a very small example in Scala which demonstrates this. Frameworks exist for multiple popular distributed data processing systems like Apache Spark, Apache Cassandra. There are other frameworks such as Chronos a cron on data center level or Marathon which allows you to run Docker based applications.
Update:
Yes, mesos will take care about the placement in the cluster as that's what a kernel does -- scheduling and management of limited resources. The setup you have sketched raises several obvious questions, however.
Layers below mesos: Installing mesos on CoreOS is possible but cumbersome as I think. This is not a typical scenario for running mesos -- usually it is moved to the lowest possible layer (above Ubuntu in your case). So I hope you have good reasons for running CoreOS and a hypervisor.
Layers above mesos: Kubernetes ist available as framework and mesosphere seems to put much effort in it. It is, however, without question that there are partly overlapping in terms of functionality -- especially with regard to scheduling. If you want to schedule basic workloads based on Containers, you might be better off with Marathon or in the future maybe Aurora. So also here I hope you have good reasons for this very arrangement. Sidenote: Kubernetes is similar to Marathon with a broader approach and pretty opinionated.