Spread vs MPI vs zeromq?

Ben Collins picture Ben Collins · Aug 30, 2008 · Viewed 13.1k times · Source

In one of the answers to Broadcast like UDP with the Reliability of TCP, a user mentions the Spread messaging API. I've also run across one called ØMQ. I also have some familiarity with MPI.

So, my main question is: why would I choose one over the other? More specifically, why would I choose to use Spread or ØMQ when there are mature implementations of MPI to be had?

Answer

Todd Gamblin picture Todd Gamblin · Sep 15, 2008

MPI was deisgned tightly-coupled compute clusters with fast, reliable networks. Spread and ØMQ are designed for large distributed systems. If you're designing a parallel scientific application, go with MPI, but if you are designing a persistent distributed system that needs to be resilient to faults and network instability, use one of the others.

MPI has very limited facilities for fault tolerance; the default error handling behavior in most implementations is a system-wide fail. Also, the semantics of MPI require that all messages sent eventually be consumed. This makes a lot of sense for simulations on a cluster, but not for a distributed application.