MPI vs GPU vs Hadoop, what are the major difference between these three parallelism?

user974270 picture user974270 · Apr 19, 2012 · Viewed 16.3k times · Source

I know for some machine learning algorithm like random forest, which are by nature should be implemented in parallel. I do a home work and find there are these three parallel programming framework, so I am interested in knowing what are the major difference between these three types of parallelism?

Especially, if some one can point me to some study compare the difference between them, that will be perfect!

Please list the pros and cons for each parallelism , thanks

Answer

prathmesh.kallurkar picture prathmesh.kallurkar · Apr 20, 2012
  1. MPI is a message passing paradigm of parallelism. Here, you have a root machine which spawns programs on all the machines in its MPI WORLD. All the threads in the system are independent and hence the only way of communication between them is through messages over network. The network bandwidth and throughput is one of the most crucial factor in MPI implementation's performance. Idea : If there is just one thread per machine and you have many cores on it, you can use OpenMP shared memory paradigm for solving subsets of your problem on one machine.

  2. CUDA is a SMT paradigm of parallelism. It uses state of the art GPU architecture to provide parallelisim. A GPU contains (blocks of ( set of cores)) working on same instruction in a lock-step fashion (This is similar to SIMD model). Hence, if all the threads in your system do a lot of same work, you can use CUDA. But the amount of shared memory and global memory in a GPU are limited and hence you should not use just one GPU for solving a huge problem.

  3. Hadoop is used for solving large problems on commodity hardware using Map Reduce paradigm. Hence, you do not have to worry about distributing data or managing corner cases. Hadoop also provides a file system HDFS for storing data on compute nodes.




Hadoop, MPI and CUDA are completely orthogonal to each other. Hence, it may not be fair to compare them.

Though, you can always use ( CUDA + MPI ) to solve a problem using a cluster of GPU's. You still need a simple core to perform the communication part of the problem.