Hadoop MapReduce vs MPI (vs Spark vs Mahout vs Mesos) - When to use one over the other?

GuSuku · Jan 6, 2015 · Viewed 7.1k times

I am new to parallel computing and just starting to try out MPI and Hadoop+MapReduce on Amazon AWS. But I am confused about when to use one over the other.

For example, one common rule of thumb I see can be summarized as:

  • Big data, non-iterative, fault tolerant => MapReduce
  • Speed, small data, iterative, non-Mapper-Reducer type => MPI
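To make the "Mapper-Reducer type" of problem concrete, here is a toy word count expressed as map and reduce steps. This is plain Python, not Hadoop code; it only shows the shape of a problem that fits the paradigm.

```python
# Toy illustration of the Mapper-Reducer shape: word count.
# Plain Python, not Hadoop -- just shows the map/reduce steps.
from collections import Counter
from functools import reduce

def mapper(line):
    # emit (word, 1) pairs for each word in the line
    return [(word, 1) for word in line.split()]

def reducer(acc, pair):
    # fold a (word, count) pair into the running totals
    word, count = pair
    acc[word] += count
    return acc

lines = ["the quick brown fox", "the lazy dog"]
pairs = [p for line in lines for p in mapper(line)]
counts = reduce(reducer, pairs, Counter())
print(counts["the"])  # 2
```

A real Hadoop job distributes the map calls across nodes and shuffles pairs by key before reducing, but the per-record logic looks much like this.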

But then, I also see an implementation of MapReduce on MPI (MR-MPI) which does not provide fault tolerance but appears more efficient on some benchmarks than MapReduce on Hadoop, and seems to handle big data using out-of-core memory.

Conversely, there are also MPI implementations (MPICH2-YARN) on the new-generation Hadoop YARN with its distributed file system (HDFS).

Besides, there seem to be provisions within MPI (Scatter-Gather, Checkpoint-Restart, ULFM and other fault-tolerance mechanisms) that mimic several features of the MapReduce paradigm.
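The Scatter-Gather pattern mentioned above does mirror map-reduce at the shape level. The following plain-Python sketch (a hypothetical stand-in using a process pool, not real MPI; in MPI this would be MPI_Scatter / MPI_Gather) shows the correspondence: the root scatters chunks, each worker computes a partial result, and the root gathers and combines them.

```python
# Shape-level sketch of MPI's Scatter-Gather as map-reduce.
# NOT real MPI: multiprocessing stands in for MPI ranks.
from multiprocessing import Pool

def partial_sum(chunk):
    # each "rank" computes over its scattered chunk (the "map")
    return sum(x * x for x in chunk)

def scatter(data, n_ranks):
    # split data into roughly equal chunks, one per rank,
    # as MPI_Scatter would distribute a buffer from the root
    k, m = divmod(len(data), n_ranks)
    return [data[i * k + min(i, m):(i + 1) * k + min(i + 1, m)]
            for i in range(n_ranks)]

if __name__ == "__main__":
    data = list(range(100))
    chunks = scatter(data, 4)                     # MPI_Scatter analogue
    with Pool(4) as pool:
        partials = pool.map(partial_sum, chunks)  # ranks work in parallel
    total = sum(partials)                         # MPI_Gather + reduce analogue
    print(total)
```

The difference in practice is that MPI keeps the ranks alive and communicating throughout the job (good for iterative algorithms), whereas Hadoop-style MapReduce materializes intermediate results between stages (good for fault tolerance).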

And how do Mahout, Mesos and Spark fit into all this?

What criteria can be used when deciding between (or a combo of) Hadoop MapReduce, MPI, Mesos, Spark and Mahout?

Answer

Aaron Altman · Apr 8, 2015

There might be good technical criteria for this decision but I haven't seen anything published on it. There seems to be a cultural divide where it's understood that MapReduce gets used for sifting through data in corporate environments while scientific workloads use MPI. That may be due to underlying sensitivity of those workloads to network performance. Here are a few thoughts about how to find out:

Many modern MPI implementations can run over multiple networks but are heavily optimized for Infiniband. The canonical use case for MapReduce seems to be a cluster of "white box" commodity systems connected via Ethernet. A quick search on "MapReduce Infiniband" leads to http://dl.acm.org/citation.cfm?id=2511027 which suggests that use of Infiniband in a MapReduce environment is a relatively new thing.

So why would you want to run on a system that's highly optimized for Infiniband? It's significantly more expensive than Ethernet but has higher bandwidth, lower latency, and scales better in cases of high network contention (ref: http://www.hpcadvisorycouncil.com/pdf/IB_and_10GigE_in_HPC.pdf).

If you have an application that would benefit from the Infiniband optimizations already baked into many MPI libraries, maybe MPI would be useful for you. If your app is relatively insensitive to network performance and spends most of its time on computations that don't require communication between processes, maybe MapReduce is the better choice.

If you have the opportunity to run benchmarks, you could do a projection on whichever system you have available to see how much improved network performance would help. Try throttling your network (downclock GigE to 100 Mbit, or Infiniband QDR to DDR, for example), draw a line through the results, and see whether the purchase of a faster interconnect optimized by MPI would get you where you want to go.
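The "draw a line through the results" step can be sketched as a simple least-squares fit of runtime against inverse bandwidth, then extrapolated to the faster interconnect. The bandwidth and runtime numbers below are made up purely for illustration; you would substitute your own throttled measurements.

```python
# Sketch of the projection step: fit runtime vs. 1/bandwidth
# from throttled benchmark runs, then extrapolate to a faster
# interconnect. All numbers here are hypothetical.
def fit_line(xs, ys):
    # least-squares fit y = a + b*x
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# throttled runs: (bandwidth in Gbit/s, measured runtime in s)
runs = [(0.1, 900.0), (1.0, 300.0), (10.0, 240.0)]
xs = [1.0 / bw for bw, _ in runs]   # model: runtime ~ a + b / bandwidth
ys = [rt for _, rt in runs]
a, b = fit_line(xs, ys)

# project runtime on a 40 Gbit/s (QDR-class) interconnect
projected = a + b / 40.0
print(projected)
```

If the projected runtime barely improves over your fastest throttled run, the application is compute-bound and the expensive interconnect (and MPI's network optimizations) probably won't pay off.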