Problem: I am developing a graphical front end for a distributed CPU/GPU simulator. As this simulator uses MPI, it requires a hostfile listing the hostnames of all computers being used on the network so that it knows which machines to distribute work across. As the end users of my application are not computer scientists (and may not even be very computer literate), I can't expect them to know or find the hostnames of every computer on their network/cluster. I would like to perform this hostname discovery programmatically so that, upon application start-up, the user can see the available machines and pick the hosts they want to run on. If possible, I would like this solution to be cross-platform, but as the simulator currently contains some Linux dependencies, I can deal with a Linux-only solution.
What I have tried so far: I tried using the nmap package to discover hosts on a network with commands like nmap -sP <ip address range>, using the IP address range for that local network. However, it only dumps the IP addresses of the hosts (not the hostnames), and I'm not sure how to translate these IP addresses into ssh hostnames (as MPI uses ssh for host discovery). Additionally, I used a similar approach with ping, supplying the broadcast address, and it returned nearly identical results.
I apologize for the broad nature of this question and the lack of code shown, but I am not very experienced with network probing / programming and I am really not even sure where to start. I tried googling this but was unable to find a suitable option (possibly because my lack of experience caused me to use improper terminology, triggering improper results). My background is primarily in graphics and user interface programming, so this is a little beyond my comfort zone.
SSH doesn't care whether it is given hostnames or IP addresses to connect to (not sure if this applies when there are host-specific configurations). Most MPI implementations don't care either, e.g. in Open MPI the addresses in connection URIs are all numeric, so a hostfile with IPs would be fine. HTTP servers, on the other hand, care because of virtual hosting, where many different sites resolve to the same IP address but the server is supplied the actual hostname via the Host HTTP header.
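Since a hostfile of IP addresses works, names are only needed for display in the front end. A minimal sketch of that idea, assuming the network provides reverse DNS (or /etc/hosts) entries: look up a name for each address and fall back to the address itself when none exists. The function names display_name and hostfile_text are hypothetical, and the one-host-per-line layout is the plain form of hostfile that Open MPI accepts.

```python
import socket


def display_name(ip, resolver=socket.gethostbyaddr):
    """Return a reverse-lookup name for ip, or ip itself if none is found.

    resolver is injectable so the lookup can be replaced in tests; by
    default it is the standard library's reverse lookup.
    """
    try:
        name, _aliases, _addresses = resolver(ip)
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return ip
    return name or ip


def hostfile_text(ips):
    """Build hostfile contents with one address per line."""
    return "".join(f"{ip}\n" for ip in ips)
```

The front end could show display_name(ip) in its host list while still writing the raw addresses to the hostfile, so a missing or stale DNS entry never affects the MPI launch.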
Unsolicited advice: finding hosts by ping is fine, but it doesn't guarantee that the machines you find are running SSH. You would do better to scan for systems that accept TCP connections on port 22:
$ nmap -oX - -sT -p22 <ip range>
-oX produces XML output that can be easily parsed. -oG is also a nice format for automated parsing of the scan results. Note that having SSH running doesn't necessarily mean that the user will be able to log into the system: it could, for example, be a network router or another remotely manageable device. One also has to take care to show only machines where the user can log on without having to supply a password, e.g. with RSA/DSA public keys; otherwise starting an MPI job would be a really tedious task. You can test each host found with something like:
$ ssh -2 -o "PreferredAuthentications=gssapi-with-mic,hostbased,publickey" \
<host> hostname
This command basically excludes all interactive authentication methods. If the connection succeeds, it will output the hostname of the remote machine. Otherwise you get a permission denied error and a non-zero exit code from the SSH client.
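The two steps above (scan for an open port 22, then try a non-interactive login) could be tied together roughly as follows. This is a sketch, not a finished implementation: it assumes the XML from the scan has already been captured as a string, and the function names are hypothetical. BatchMode and ConnectTimeout are real OpenSSH options added here so that a GUI never blocks on a prompt or an unresponsive host; they are not part of the original command, and the -2 flag is left out because current OpenSSH releases only speak protocol 2.

```python
import subprocess
import xml.etree.ElementTree as ET


def hosts_with_open_ssh(nmap_xml):
    """Return IPv4 addresses of hosts whose TCP port 22 is reported open."""
    root = ET.fromstring(nmap_xml)
    found = []
    for host in root.iter("host"):
        addr = host.find("address[@addrtype='ipv4']")
        if addr is None:
            continue
        for port in host.iter("port"):
            state = port.find("state")
            if (port.get("portid") == "22"
                    and state is not None
                    and state.get("state") == "open"):
                found.append(addr.get("addr"))
                break
    return found


def ssh_probe_command(host):
    """Build the non-interactive ssh test command for one host."""
    return [
        "ssh",
        "-o", "PreferredAuthentications=gssapi-with-mic,hostbased,publickey",
        "-o", "BatchMode=yes",       # assumption: never prompt the user
        "-o", "ConnectTimeout=5",    # assumption: give up on slow hosts
        host,
        "hostname",
    ]


def remote_hostname(host, timeout=15):
    """Return the name the remote machine reports, or None if login fails."""
    try:
        result = subprocess.run(
            ssh_probe_command(host),
            capture_output=True,
            text=True,
            timeout=timeout,
            stdin=subprocess.DEVNULL,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None
```

A front end could call hosts_with_open_ssh on the scan output, run remote_hostname for each address (ideally in a worker thread), and list only the hosts for which a name comes back.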