Requesting nodes by number and by name in SGE

Jayavant · Jun 22, 2012
  1. How do I request a number of nodes (not procs) when submitting a job in SGE?

    For example, in TORQUE we can specify qsub -l nodes=3

  2. How do I request nodes by their names in SGE?

    For example, in TORQUE we can do this with qsub -l nodes=abc+xyz+pqr, where abc, xyz and pqr are hostnames.

    For a single hostname, qsub -l hostname=abc works. But how do I delimit multiple hostnames in SGE?

Answer

Daniel · Jul 10, 2012

Requesting a number of nodes in Grid Engine is done indirectly. To submit a parallel job, you request a parallel environment (see man sge_pe) together with a number of slots (processors etc.), for example qsub -pe mytestpe 12...

Depending on the allocation_rule defined in the parallel environment (qconf -sp mytestpe), the slots are distributed over one or more nodes. It is easy with a so-called fixed allocation rule, where the allocation rule is simply a number such as 4 (4 slots per host). If you want one host, submit with -pe mytestpe 4; if you want 10 nodes, submit with -pe mytestpe 40.
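
The arithmetic for a fixed allocation rule can be sketched in a few lines of shell. This is only an illustration: slots_for_nodes is a hypothetical helper, job.sh is a placeholder script name, and the value 4 is assumed to match the allocation_rule that qconf -sp mytestpe reports on your cluster. The command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: slots to request for N whole nodes under a fixed rule.
# $1 = number of nodes wanted
# $2 = slots per host (the integer allocation_rule of the parallel environment)
slots_for_nodes() {
    echo $(( $1 * $2 ))
}

# Assumed values for illustration only; check yours with: qconf -sp mytestpe
PE_NAME=mytestpe
SLOTS_PER_HOST=4
NODES=10

SLOTS=$(slots_for_nodes "$NODES" "$SLOTS_PER_HOST")

# Print the command instead of submitting; job.sh is a placeholder.
echo "qsub -pe $PE_NAME $SLOTS job.sh"
```

With the values above this prints a request for 40 slots, which a fixed rule of 4 spreads over 10 hosts.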

A node can be requested by name with -l h=abc. Since node names are RESTRINGs (regular expression strings) in Grid Engine, you can write a regular expression for host filtering: qsub -l h="abc|xyz". You can also create host groups (qconf -ahgrp) and request so-called queue domains (qsub -q all.q@@mygroup).
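
For more than two hosts, the expression can be assembled from a list of names. A minimal sketch, assuming the three hostnames from the question: host_expr is a hypothetical helper, job.sh is a placeholder script name, and the command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: join all arguments with "|" to form a host expression.
# Inside the subshell, "$*" joins the arguments with the first character of IFS.
host_expr() {
    ( IFS='|'; echo "$*" )
}

# Hostnames taken from the question; replace with your own.
EXPR=$(host_expr abc xyz pqr)

# Quote the expression so the shell does not treat "|" as a pipe.
echo "qsub -l h=\"$EXPR\" job.sh"
```

The quoting matters: without it the shell would interpret each "|" as a pipeline separator before qsub ever sees the expression.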

Daniel

http://www.gridengine.eu