What are some scenarios for which MPI is a better fit than MapReduce?

parallel-processing distributed mapreduce mpi

Igor Ostrovsky · Oct 7, 2009 · Viewed 11.5k times · Source

As far as I understand, MPI gives me much more control over how exactly different nodes in the cluster will communicate.

In MapReduce/Hadoop, each node does some computation (map), exchanges data with other nodes (shuffle), and then collates its partition of the results (reduce). It seems simple, but since you can iterate the process, even algorithms like K-means or PageRank fit the model quite well. With a distributed file system and locality-aware scheduling, the performance is apparently good. In comparison, MPI gives me explicit control over how nodes send messages to each other.
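To make "iterate the process" concrete, here is a minimal single-process sketch of K-means written as repeated map, shuffle, and reduce steps. It is not Hadoop code: the function names (`kmeans_map`, `shuffle`, `kmeans_reduce`, `kmeans`) are made up for illustration, the points are one-dimensional, and a real job would run the map and reduce steps on different nodes.

```python
from collections import defaultdict


def kmeans_map(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: abs(point - centroids[i]))
    return nearest, point


def shuffle(pairs):
    """Shuffle step: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def kmeans_reduce(values):
    """Reduce step: the new centroid is the mean of its assigned points."""
    return sum(values) / len(values)


def kmeans_iteration(points, centroids):
    """One full map -> shuffle -> reduce pass over all points."""
    groups = shuffle(kmeans_map(p, centroids) for p in points)
    # A centroid with no assigned points keeps its old position.
    return [kmeans_reduce(groups[i]) if i in groups else c
            for i, c in enumerate(centroids)]


def kmeans(points, centroids, iterations=10):
    """Iterate the pass; each iteration corresponds to one MapReduce job."""
    for _ in range(iterations):
        centroids = kmeans_iteration(points, centroids)
    return centroids
```

Each call to `kmeans_iteration` stands in for one complete job, so the only communication between iterations is the updated list of centroids.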

Can anyone describe a cluster programming scenario where the more general MPI model is an obvious advantage over the simpler MapReduce model?

Answer

Almost any scientific code: finite differences, finite elements, etc. That leads to the somewhat circular answer that any distributed program which doesn't map easily onto MapReduce is better implemented with the more general MPI model. Not sure that's much help to you; I'll downvote this answer right after I post it.
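As a rough illustration of why finite-difference codes fit the message-passing model, here is a small sketch of a 1D explicit diffusion step split across several "ranks". It is a single-process simulation, not real MPI: the ranks are just list chunks, the function names (`step_serial`, `split`, `step_decomposed`) and the parameter `alpha` are assumptions for the example, and in an actual MPI program the halo values would typically be exchanged with a neighbour send/receive such as `MPI_Sendrecv`.

```python
def step_serial(u, alpha):
    """One explicit finite-difference diffusion step; the two end cells stay fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


def split(u, nranks):
    """Divide the grid into contiguous chunks, one per simulated rank."""
    size = len(u) // nranks
    return [u[r * size:(r + 1) * size] if r < nranks - 1 else u[r * size:]
            for r in range(nranks)]


def step_decomposed(chunks, alpha):
    """One step in which each rank sees only its own chunk plus two halo cells."""
    nranks = len(chunks)
    new_chunks = []
    for r, chunk in enumerate(chunks):
        # Halo exchange: one value from each neighbouring rank.
        # In real MPI this would be a point-to-point message, not a list lookup.
        left = chunks[r - 1][-1] if r > 0 else None
        right = chunks[r + 1][0] if r < nranks - 1 else None
        new = chunk[:]
        for i in range(len(chunk)):
            lo = chunk[i - 1] if i > 0 else left
            hi = chunk[i + 1] if i < len(chunk) - 1 else right
            if lo is None or hi is None:
                continue  # cell on the global boundary stays fixed
            new[i] = chunk[i] + alpha * (lo - 2 * chunk[i] + hi)
        new_chunks.append(new)
    return new_chunks


if __name__ == "__main__":
    u = [0.0] * 12
    u[5] = 1.0
    serial, chunks = u, split(u, 3)
    for _ in range(20):
        serial = step_serial(serial, 0.25)
        chunks = step_decomposed(chunks, 0.25)
    # The decomposed run should reproduce the serial run.
    print([x for c in chunks for x in c] == serial)
```

The point of the sketch is the communication pattern: each time step moves only two numbers per rank, to fixed neighbours, while the rest of the data stays where it is. Expressing the same loop as iterated MapReduce would mean a full map, shuffle, and reduce pass for every time step.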
