Python multiprocessing - process hangs on join for large queue

user545424 · Feb 8, 2014

I'm running Python 2.7.3 and noticed the following strange behavior. Consider this minimal example:

from multiprocessing import Process, Queue

def foo(qin, qout):
    while True:
        bar = qin.get()
        if bar is None:
            break
        qout.put({'bar': bar})

if __name__ == '__main__':
    import sys

    qin = Queue()
    qout = Queue()
    worker = Process(target=foo,args=(qin,qout))
    worker.start()

    for i in range(100000):
        print i
        sys.stdout.flush()
        qin.put(i**2)

    qin.put(None)
    worker.join()

When I loop over 10,000 or more, my script hangs on worker.join(). It works fine when the loop only goes to 1,000.

Any ideas?

Answer

Armin Rigo · Feb 8, 2014

The pipe behind the qout queue fills up. The data that foo() puts into qout doesn't fit in the buffer of the OS pipe the queue uses internally, so the subprocess blocks trying to write more data. Meanwhile the parent process never reads this data: it is blocked too, in worker.join(), waiting for the subprocess to finish. This is a typical deadlock.
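
One way to break the deadlock is to have the parent read everything out of qout before it calls worker.join(). The following is a minimal sketch of that idea, not necessarily the only fix. It assumes the worker produces exactly one result per input; the run_pipeline() helper is a hypothetical name introduced here for illustration, and the code is written to run on both Python 2.7 and Python 3.

```python
from multiprocessing import Process, Queue


def foo(qin, qout):
    # Same worker as in the question: wrap each input in a dict
    # until the None sentinel arrives.
    while True:
        bar = qin.get()
        if bar is None:
            break
        qout.put({'bar': bar})


def run_pipeline(n):
    # Hypothetical helper: feed n squares to the worker, collect n results.
    qin = Queue()
    qout = Queue()
    worker = Process(target=foo, args=(qin, qout))
    worker.start()

    for i in range(n):
        qin.put(i ** 2)
    qin.put(None)  # sentinel: tells the worker to stop

    # Drain qout BEFORE joining. Assumes one result per input, so
    # exactly n blocking get() calls empty the queue.
    results = [qout.get() for _ in range(n)]

    # The worker has nothing left to flush into the pipe, so it can
    # exit and join() returns.
    worker.join()
    return results


if __name__ == '__main__':
    results = run_pipeline(100000)
    print(len(results))
```

Because the parent keeps reading while the subprocess writes, the pipe never stays full, and by the time join() is called the subprocess has nothing left to write. If the number of results is not known in advance, the worker could instead put its own sentinel on qout when it finishes, and the parent could read until it sees that sentinel.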