I need to write a scientific application in C++ that does a lot of computation and uses a lot of memory. I have part of the job done, but due to the high resource requirements I was thinking of moving to OpenMPI.
Before doing that, I have a simple curiosity: if I understood the principle of OpenMPI correctly, it is the developer who has the task of splitting the job across the different nodes, calling SEND and RECEIVE based on which nodes are available at the time.
Do you know if there exists some library, OS, or whatever that has this capability, letting my code remain as it is now? Basically something that connects all the computers and lets them share their memory and CPU as if they were a single machine?
I am a bit confused by the huge volume of material available on the topic. Should I look at cloud computing, or at distributed shared memory?
Currently there is no C++ library or utility that will automatically parallelize your code across a cluster of machines. While there are many other ways to achieve distributed computing, to get good performance you really want to optimize your application around message passing or distributed shared memory.
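For illustration, here is a minimal sketch of what that explicit message passing looks like with MPI. The ranks, buffer sizes, and values are made up for the example; compile with your MPI implementation's wrapper (e.g. `mpic++`) and launch with `mpirun -np 2`:

```cpp
// Rank 0 sends a chunk of work to rank 1, which sends back a result.
// This is the explicit SEND/RECEIVE model you described: the programmer
// decides what data goes to which node and when.
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        double chunk[4] = {1.0, 2.0, 3.0, 4.0};
        // Send 4 doubles to rank 1 with message tag 0.
        MPI_Send(chunk, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);

        double result;
        MPI_Recv(&result, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        std::printf("sum computed on rank 1: %f\n", result);
    } else if (rank == 1) {
        double chunk[4];
        MPI_Recv(chunk, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

        double sum = 0.0;
        for (double x : chunk) sum += x;   // the actual "work"
        MPI_Send(&sum, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```

Note that nothing here is automatic: the programmer decides which rank owns which data and when messages are exchanged, which is exactly the work you were hoping a library could do for you.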
Your best bet is to bite the bullet and distribute the work yourself using message passing. Implementing a parallel distributed solution is one thing, though; making it work efficiently is another. Read up on the different network topologies and parallel computing patterns so that implementing a solution is less painful than starting entirely from scratch (one such pattern is sketched below).
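As an example of one common pattern, here is a sketch of the classic scatter/compute/reduce decomposition: a root rank splits an array evenly across all ranks, each rank computes a partial result locally, and the partials are combined back on the root. The array size and values are placeholders:

```cpp
#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int per_rank = 1000;            // elements handled by each rank
    std::vector<double> data;             // full array, only used on rank 0
    if (rank == 0) data.assign(per_rank * size, 1.0);

    // Distribute equal slices; every rank receives per_rank elements.
    std::vector<double> slice(per_rank);
    MPI_Scatter(data.data(), per_rank, MPI_DOUBLE,
                slice.data(), per_rank, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    // Each rank computes its partial sum independently.
    double local = 0.0;
    for (double x : slice) local += x;

    // Combine the partial results on rank 0.
    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) std::printf("total = %f\n", total);

    MPI_Finalize();
    return 0;
}
```

Collectives like MPI_Scatter and MPI_Reduce are usually both simpler and faster than hand-rolled loops of MPI_Send/MPI_Recv, because the MPI implementation can use optimized communication topologies under the hood.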