MPICH2 on multiple machines (HYDU_sock_connect error)

montekristo_07 · Nov 16, 2013 · Viewed 7.3k times

I am trying to run an MPI program on two different PCs. However, when I run this command on pc1:

mpirun -hosts user@host -n 4 bin/Demo_01.exe 

I'm getting this error:

[proxy:0:0@pc2] HYDU_sock_connect (./utils/sock/sock.c:203): unable to connect from "pc2" to "pc1" (Connection refused)

[proxy:0:0@pc2] main (./pm/pmiserv/pmip.c:209): unable to connect to server ubuntu at port 57395 (check for firewalls!)

Although I have configured passwordless SSH connections and disabled the firewall on each machine, the error is still there. My operating system is Ubuntu 12.04 and the MPI implementation is MPICH2.

Can anyone help?

Answer

brotich · May 13, 2016

The error occurs because the client cannot connect back to the server: it doesn't know the server's IP address, i.e. "...main (./pm/pmiserv/pmip.c:209): unable to connect to server ubuntu at..." etc.

The fix is to add each hostname and its IP address to /etc/hosts, e.g.:

172.17.0.2  master
172.17.0.3  node1
172.17.0.4  node2

This should allow bidirectional communication between the master and the node clients.
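
To confirm that the entries are in place before launching mpirun again, a small check along these lines may help. This is only a sketch: the function name check_hosts is made up for illustration, and it only looks for names in a hosts-format file; it does not test that the machines can actually reach each other.

```shell
#!/bin/sh
# check_hosts FILE NAME...
# Hypothetical helper: reports, for each NAME, whether FILE (in /etc/hosts
# format) has an entry for it. Returns 1 if any name is missing, 0 otherwise.
check_hosts() {
    file=$1
    shift
    missing=0
    for name in "$@"; do
        if awk -v h="$name" '
            { sub(/#.*/, "") }   # ignore comments
            { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }   # field 1 is the IP
            END { exit found ? 0 : 1 }' "$file"; then
            echo "ok: $name"
        else
            echo "missing: $name"
            missing=1
        fi
    done
    return $missing
}
```

With the example entries above, it could be run on each machine as check_hosts /etc/hosts master node1 node2. To see what address a name really resolves to on a live system, getent hosts master prints the resolver's answer.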