Understanding Popen.communicate

Black_Hat · May 27, 2013 · Viewed 174.1k times

I have a script named 1st.py which implements a simple REPL (read-eval-print loop):

print "Something to print"
while True:
    r = raw_input()
    if r == 'n':
        print "exiting"
        break
    else:
        print "continuing"

I then launched 1st.py with the following code:

import subprocess
p = subprocess.Popen(["python", "1st.py"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)

And then tried this:

print p.communicate()[0]

It failed, providing this traceback:

Traceback (most recent call last):
  File "1st.py", line 3, in <module>
    r = raw_input()
EOFError: EOF when reading a line

Can you explain what is happening here please? When I use p.stdout.read(), it hangs forever.

Answer

jfs · May 27, 2013

.communicate() writes the input to the subprocess (there is no input in this case, so it just closes the subprocess's stdin to signal that no more input is coming), reads all of the output, and waits for the subprocess to exit.

The EOFError exception is raised in the child process by raw_input(), which expected data but got EOF (no data).

p.stdout.read() hangs forever because it tries to read all of the child's output, that is, until EOF, while the child is itself blocked in raw_input() waiting for input. Neither side can make progress, which is a deadlock.
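Both behaviours can be reproduced with a small sketch. It is written for Python 3 (the question uses Python 2), and the inline CHILD script is a hypothetical stand-in for 1st.py; the helper name run_with_input is also made up for illustration.

```python
import subprocess
import sys

# Hypothetical Python 3 stand-in for 1st.py, passed inline via -c.
CHILD = r'''
print("Something to print")
while True:
    r = input()
    if r == "n":
        print("exiting")
        break
    print("continuing")
'''


def run_with_input(text):
    """Start the child, feed `text` to its stdin, return (stdout, stderr, returncode)."""
    p = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # text mode: str in, str out
    )
    # communicate() writes `text`, closes stdin, reads stdout and stderr
    # until EOF, and waits for the child to exit.
    out, err = p.communicate(text)
    return out, err, p.returncode
```

Calling run_with_input("y\nn\n") lets the child finish normally, while run_with_input("") closes stdin straight away, so the child's input() call fails with EOFError, much like raw_input() does in the question.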

To avoid the deadlock, you need to read and write asynchronously (e.g., by using threads or select), or to know exactly when and how much to read and write, for example:

from subprocess import PIPE, Popen

p = Popen(["python", "-u", "1st.py"], stdin=PIPE, stdout=PIPE, bufsize=1)
print p.stdout.readline(), # read the first line
for i in range(10): # repeat several times to show that it works
    print >>p.stdin, i # write input
    p.stdin.flush() # not necessary in this case
    print p.stdout.readline(), # read output

print p.communicate("n\n")[0], # signal the child to exit,
                               # read the rest of the output, 
                               # wait for the child to exit

Note: this code is very fragile. If the reads and writes get out of sync, it deadlocks.
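One way to make the exchange less fragile is the thread-based approach mentioned earlier: a background thread drains the child's stdout into a queue, so the parent never blocks indefinitely on a read. The following is a minimal Python 3 sketch under stated assumptions, not a definitive implementation; CHILD is a hypothetical stand-in for 1st.py, and start_child and demo are illustrative names.

```python
import queue
import subprocess
import sys
import threading

# Hypothetical Python 3 stand-in for 1st.py, passed inline via -c.
CHILD = r'''
print("Something to print")
while True:
    r = input()
    if r == "n":
        print("exiting")
        break
    print("continuing")
'''


def start_child():
    """Start the child and a daemon thread that queues each line it prints."""
    p = subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        bufsize=1,  # line-buffered pipes on the parent side
    )
    lines = queue.Queue()

    def pump():
        for line in p.stdout:
            lines.put(line)
        lines.put(None)  # sentinel: the child closed its stdout

    threading.Thread(target=pump, daemon=True).start()
    return p, lines


def demo():
    """Send two answers and collect the child's replies via the queue."""
    p, lines = start_child()
    received = [lines.get(timeout=5)]  # greeting line
    for answer in ("y", "n"):
        p.stdin.write(answer + "\n")
        p.stdin.flush()
        # A timeout raises queue.Empty instead of hanging forever.
        received.append(lines.get(timeout=5))
    p.stdin.close()
    p.wait()
    return [line.rstrip("\n") for line in received]
```

The timeout on lines.get() is the design point here: if the child stops answering, the parent gets a queue.Empty exception it can handle rather than a silent hang.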

Beware of the block-buffering issue (here it is solved by the "-u" flag, which turns off buffering for stdin and stdout in the child).

bufsize=1 makes the pipes line-buffered on the parent side.
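For readers on Python 3, the same lockstep exchange might look like the sketch below. It assumes text-mode pipes (universal_newlines=True), which is what allows bufsize=1 to mean line buffering there; CHILD is a hypothetical stand-in for 1st.py and lockstep is an illustrative name.

```python
import subprocess
import sys

# Hypothetical Python 3 stand-in for 1st.py, passed inline via -c.
CHILD = r'''
print("Something to print")
while True:
    r = input()
    if r == "n":
        print("exiting")
        break
    print("continuing")
'''


def lockstep(n):
    """Write n lines to the child, reading exactly one reply line after each."""
    p = subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD],  # -u: unbuffered output in the child
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        bufsize=1,  # line-buffered on the parent side
    )
    seen = [p.stdout.readline()]  # greeting line
    for i in range(n):
        p.stdin.write("%d\n" % i)  # one line in ...
        p.stdin.flush()
        seen.append(p.stdout.readline())  # ... exactly one line out
    # Ask the child to exit, read the remaining output, wait for it to finish.
    rest, _ = p.communicate("n\n")
    seen.append(rest)
    return [s.rstrip("\n") for s in seen]
```

This only works because the child prints exactly one line per line of input; any mismatch in that protocol brings back the deadlock.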