Using subprocess with select and pty hangs when capturing output

ravenac95 · Jun 23, 2012

I'm trying to write a Python program that can interact with other programs, which means sending data to their stdin and receiving data from their stdout. I cannot use pexpect (although it definitely inspired some of the design). The process I'm using right now is this:

  1. Attach a pty to the subprocess's stdout
  2. Loop until the subprocess exits by checking subprocess.poll
    • When data is available on the subprocess's stdout, write it immediately to the current stdout.
  3. Finish!

I've been prototyping some code (below) which works but has one flaw that is bugging me. After the child process has completed, the parent process hangs if I do not specify a timeout when using select.select. I would really prefer not to set a timeout. It just seems a bit dirty. However, none of the other ways I've tried to get around the issue seem to work. Pexpect seems to get around it by using os.execv and pty.fork instead of subprocess.Popen and pty.openpty, a solution I would rather avoid. Am I doing something wrong with how I check whether the subprocess is still alive? Is my approach incorrect?

The code I'm using is below. I'm running this on Mac OS X 10.6.8, but I need it to work on Ubuntu 12.04 as well.

This is the subprocess runner runner.py:

import subprocess
import select
import pty
import os
import sys

def main():
    master, slave = pty.openpty()

    process = subprocess.Popen(['python', 'outputter.py'], 
            stdin=subprocess.PIPE, 
            stdout=slave, stderr=slave, close_fds=True)

    while process.poll() is None:
        # Just FYI timeout is the last argument to select.select
        rlist, wlist, xlist = select.select([master], [], [])
        for f in rlist:
            output = os.read(f, 1000) # This is used because it doesn't block
            sys.stdout.write(output)
            sys.stdout.flush()
    print "**ALL COMPLETED**"

if __name__ == '__main__':
    main()

This is the subprocess code, outputter.py. The strange random parts are just there to simulate a program outputting data at random intervals. You can remove them if you wish. It shouldn't matter:

import time
import sys
import random

def main():
    lines = ['hello', 'there', 'what', 'are', 'you', 'doing']
    for line in lines:
        sys.stdout.write(line + random.choice(['', '\n']))
        sys.stdout.flush()
        time.sleep(random.choice([1,2,3,4,5])/20.0)
    sys.stdout.write("\ndone\n")
    sys.stdout.flush()

if __name__ == '__main__':
    main()

Thanks for any help you all can provide!

Extra note

pty is used because I want to ensure that stdout isn't buffered.
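As a rough illustration of why a pty helps here: many programs (C stdio and Python among them) pick their stdout buffering mode based on whether stdout is a terminal. The sketch below is written for Python 3 rather than the Python 2 of the question, and child_sees_tty and CHILD are hypothetical names introduced only for this example. It asks a child process whether it thinks its stdout is a tty, once through a plain pipe and once through a pty.

```python
import os
import pty
import subprocess
import sys

# One-line child program: report whether its stdout is a terminal.
CHILD = "import sys; print(sys.stdout.isatty())"


def child_sees_tty(use_pty):
    """Return True if the child reports that its stdout is a tty."""
    argv = [sys.executable, "-c", CHILD]
    if not use_pty:
        # Plain pipe: the child sees a non-terminal stdout.
        out = subprocess.check_output(argv)
        return out.strip() == b"True"

    master, slave = pty.openpty()
    try:
        proc = subprocess.Popen(argv, stdout=slave, close_fds=True)
        proc.wait()
        # The short reply is already sitting in the pty buffer,
        # so this read returns immediately.
        out = os.read(master, 1024)
    finally:
        os.close(slave)
        os.close(master)
    # The tty layer turns "\n" into "\r\n"; strip() removes both.
    return out.strip() == b"True"
```

With the pty the child should report a terminal, and with the pipe it should not, which is the distinction buffering decisions typically hang on.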

Answer

Antti Haapala · Sep 1, 2012

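First of all, os.read does block, contrary to what you state. However, it does not block when called right after select has reported the descriptor as readable. Also, os.read returns an empty string at end of file, that is, once the other end has been closed, and you might want to check for that. (On Linux, reading a pty master whose slave end has been fully closed raises OSError with errno EIO instead.)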
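A minimal sketch of the empty-read check, using an ordinary pipe instead of a pty to keep it portable. It is written for Python 3, where os.read returns bytes, and read_until_eof is a hypothetical helper name, not something from the question.

```python
import os


def read_until_eof(fd, chunk_size=1000):
    """Read from fd until os.read signals end of file with an empty result."""
    chunks = []
    while True:
        data = os.read(fd, chunk_size)  # blocks until data or EOF
        if not data:
            # Empty result: every write end has been closed.
            break
        chunks.append(data)
    return b"".join(chunks)


r, w = os.pipe()
os.write(w, b"hello")
os.close(w)  # closing the only write end is what produces EOF
result = read_until_eof(r)
os.close(r)
```

If the write end were left open, the second os.read would block instead of returning an empty result, which is the same shape of problem as the hang in the question.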

The real problem, however, is that nothing ever signals end of file on the master descriptor: neither it nor the parent's own copy of the slave descriptor is ever closed, so the final select is the one that blocks. In a rare race, the child process exits between the select call and process.poll(), and your program exits cleanly. Most of the time, however, the select blocks forever.
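The blocking select can be observed without actually hanging by giving it a short timeout purely for the demonstration. This is a sketch for Python 3 under the assumption that the parent keeps both pty ends open, as runner.py does; select_after_child_exit is a hypothetical name.

```python
import os
import pty
import select
import subprocess
import sys


def select_after_child_exit(timeout=0.5):
    """Return select's readable list for the master after the child is gone."""
    master, slave = pty.openpty()
    try:
        proc = subprocess.Popen(
            [sys.executable, "-c", "print('hi')"],
            stdout=slave, stderr=slave, close_fds=True)
        proc.wait()  # the child has exited by now

        # Drain whatever the child wrote before it exited.
        while select.select([master], [], [], 0)[0]:
            os.read(master, 1000)

        # Nothing is left to read and no EOF arrives, because the parent
        # still holds the slave end open. Without the timeout this call
        # would never return.
        rlist, _, _ = select.select([master], [], [], timeout)
        return rlist
    finally:
        os.close(slave)
        os.close(master)
```

An empty list here means select timed out: the child is dead, yet the master never became readable again.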

If you install the signal handler as proposed by izhak, all hell breaks loose: whenever a child process terminates, the signal handler is run. After the signal handler has run, the system call that was interrupted in that thread cannot be continued, so it fails with errno set to EINTR, which usually surfaces in Python as an exception at whatever call happened to be in progress. Now, if elsewhere in your program you use some library that makes blocking system calls and does not know how to handle such exceptions, you are in big trouble (any os.read anywhere, for example, can now throw an exception, even after a successful select).

Weighing random exceptions thrown anywhere against polling a bit, I don't think the timeout on select sounds like such a bad idea. Your process would hardly be the only (slow) polling process on the system anyway.
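For what it's worth, here is one way the timeout variant could look. It is only a sketch for Python 3, not the original runner: run_with_timeout and its timeout parameter are hypothetical, and it collects the output into a bytes object instead of echoing it so the result is easy to inspect. The child's exit status is checked before each select, so output written just before the child exits is still drained.

```python
import os
import pty
import select
import subprocess
import sys


def run_with_timeout(argv, timeout=0.1):
    """Run argv with stdout/stderr on a pty and return everything it wrote."""
    master, slave = pty.openpty()
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE,
                            stdout=slave, stderr=slave, close_fds=True)
    chunks = []
    try:
        while True:
            # Check for exit first: if the child is already gone, all of
            # its output is in the pty buffer and a zero timeout suffices.
            exited = proc.poll() is not None
            wait = 0 if exited else timeout
            rlist, _, _ = select.select([master], [], [], wait)
            if rlist:
                chunks.append(os.read(master, 1000))
            elif exited:
                break  # child is dead and nothing is left to read
    finally:
        proc.stdin.close()
        os.close(slave)
        os.close(master)
    return b"".join(chunks)
```

Calling run_with_timeout([sys.executable, 'outputter.py']) would run the script from the question. Note that the pty translates each "\n" from the child into "\r\n". The loop wakes up at most once per timeout interval while the child is quiet, which is the modest amount of polling being traded against signal handling.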