Python threading multiple bash subprocesses?

Andrew picture Andrew · Jan 26, 2013 · Viewed 51.5k times · Source

How does one use the threading and subprocess modules to spawn parallel bash processes? When I start threads ala the first answer here: How to use threading in Python?, the bash processes run sequentially instead of in parallel.

Answer

jfs picture jfs · Jan 26, 2013

You don't need threads to run subprocesses in parallel:

from subprocess import Popen

commands = [
    'date; ls -l; sleep 1; date',
    'date; sleep 5; date',
    'date; df -h; sleep 3; date',
    'date; hostname; sleep 2; date',
    'date; uname -a; date',
]
# run in parallel
processes = [Popen(cmd, shell=True) for cmd in commands]
# do other things here..
# wait for completion
for p in processes: p.wait()

To limit number of concurrent commands you could use multiprocessing.dummy.Pool that uses threads and provides the same interface as multiprocessing.Pool that uses processes:

from functools import partial
from multiprocessing.dummy import Pool
from subprocess import call

pool = Pool(2) # two concurrent commands at a time
for i, returncode in enumerate(pool.imap(partial(call, shell=True), commands)):
    if returncode != 0:
       print("%d command failed: %d" % (i, returncode))

This answer demonstrates various techniques to limit number of concurrent subprocesses: it shows multiprocessing.Pool, concurrent.futures, threading + Queue -based solutions.


You could limit the number of concurrent child processes without using a thread/process pool:

from subprocess import Popen
from itertools import islice

max_workers = 2  # no more than 2 concurrent processes
processes = (Popen(cmd, shell=True) for cmd in commands)
running_processes = list(islice(processes, max_workers))  # start new processes
while running_processes:
    for i, process in enumerate(running_processes):
        if process.poll() is not None:  # the process has finished
            running_processes[i] = next(processes, None)  # start new process
            if running_processes[i] is None: # no new processes
                del running_processes[i]
                break

On Unix, you could avoid the busy loop and block on os.waitpid(-1, 0), to wait for any child process to exit.