Running programs in parallel using xargs

Olivier · Feb 6, 2015 · Viewed 81.4k times

I currently have the following script.

#!/bin/bash
# script.sh

for i in {0..99}; do
   script-to-run.sh input/ output/ $i
done

I wish to run it in parallel using xargs. I have tried

script.sh | xargs -P8

But doing the above only executed one at a time. No luck with -n8 either. Adding & at the end of the line executed inside the script's for loop would try to run the script 100 times at once. How do I execute the loop only 8 at a time, up to 100 total?

Answer

Etan Reisner · Feb 6, 2015

From the xargs man page:

This manual page documents the GNU version of xargs. xargs reads items from the standard input, delimited by blanks (which can be protected with double or single quotes or a backslash) or newlines, and executes the command (default is /bin/echo) one or more times with any initial-arguments followed by items read from standard input. Blank lines on the standard input are ignored.

This means that, in your example, xargs waits, collects all of the output from your script, and then runs echo <that output>. That is neither very useful nor what you wanted.
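
A small way to see that default behaviour for yourself (only a sketch; the letters a, b and c are placeholder input, not output from your script):

```shell
# With no command given, xargs falls back to echo and passes it
# every item it read from standard input.
out=$(printf '%s\n' a b c | xargs)
echo "$out"   # prints: a b c
```

The three input lines end up as arguments to a single echo, which is the same thing that happens to whatever script.sh prints.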

The -n argument sets how many items from the input are used with each command that gets run (by itself, it says nothing about parallelism).
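
To illustrate what -n does on its own, here is a minimal sketch with placeholder numbers and an explicit echo. Each echo receives at most two items, and the invocations still run one after another:

```shell
# -n 2: hand at most two input items to each echo invocation.
out=$(printf '%s\n' 1 2 3 4 5 | xargs -n 2 echo)
echo "$out"
# prints:
# 1 2
# 3 4
# 5
```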

To do what you want with xargs you would need to do something more like this (untested):

printf %s\\n {0..99} | xargs -n 1 -P 8 script-to-run.sh input/ output/

This breaks down as follows:

  • printf %s\\n {0..99} - Print one number per line, from 0 to 99.
  • Run xargs
    • taking at most one argument per command line it runs
    • and running up to eight processes at a time
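
If you want to try the pattern before pointing it at the real script, something like the following should work. It is only a sketch under a few assumptions: the script-to-run.sh written here is a hypothetical stand-in that just prints its arguments, seq 0 99 is swapped in for {0..99} so the snippet also runs under plain sh, and the worker is started through sh so it does not need the execute bit.

```shell
# Hypothetical stand-in for script-to-run.sh, written to a temp directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/script-to-run.sh" <<'EOF'
# Print the arguments this invocation received.
echo "in=$1 out=$2 index=$3"
EOF

# One index per invocation (-n 1), at most eight invocations at once (-P 8).
# Each index is appended after the fixed arguments input/ and output/.
results=$(seq 0 99 | xargs -n 1 -P 8 sh "$tmpdir/script-to-run.sh" input/ output/)

# Completion order is not guaranteed, so count and sort instead of
# relying on the order of the output lines.
count=$(printf '%s\n' "$results" | wc -l)
first=$(printf '%s\n' "$results" | sort -t= -k4 -n | head -n 1)
echo "jobs run: $count"
echo "lowest index line: $first"

rm -rf "$tmpdir"
```

Because up to eight workers run at once, their output can arrive in any order; if the order matters, have each job write to its own file under output/ rather than to standard output.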