multiprocessing.Pool in Jupyter notebook works on Linux but not on Windows

user1999728 · May 8, 2016

I'm trying to run a few independent computations that all read from the same data. My code works when I run it on Ubuntu, but not on Windows (Windows Server 2012 R2), where I get the error:

'module' object has no attribute ...

when I try to use multiprocessing.Pool. (The error appears in the kernel console, not as output in the notebook itself.)

(I had already made the mistake of defining the function after creating the pool, and I have since corrected it, so that is not the problem.)

This happens even on the simplest of examples:

from multiprocessing import Pool
def f(x):
    return x**2
pool = Pool(4)
for res in pool.map(f, range(20)):
    print(res)

I know that the worker processes need to be able to import the module that defines the function (and I have no idea how this works in a notebook). I've also heard of IPython.parallel, but I have been unable to find any documentation or examples.

Any solutions/alternatives would be most welcome.

Answer

GRAYgoose124 · Dec 23, 2016

I would have posted this as a comment, since I don't have a full answer, but I'll amend it as I figure out what is going on.

from multiprocessing import Pool

def f(x):
    return x**2

if __name__ == '__main__':
    pool = Pool(4)
    for res in pool.map(f,range(20)):
        print(res)

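This works. I believe the answer to this question is here. In short, on Windows the child processes are not forked: each one starts a fresh interpreter that re-imports the main module. The subprocesses do not know they are subprocesses, so without the guard each of them attempts to run the main script, including the Pool creation, recursively.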
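To make the difference between the two platforms visible, here is a minimal sketch, assuming Python 3.4 or later (for multiprocessing.get_start_method and for using Pool as a context manager). The names square and run_pool are made up for this illustration. Everything that starts processes sits behind the guard, so a child that re-imports the file only picks up the function definitions.

```python
import multiprocessing as mp


def square(x):
    # Defined at module level so that worker processes can import it.
    return x ** 2


def run_pool(n_workers=4, n_items=20):
    # Hypothetical helper: create the pool, map the work, and clean up.
    # Leaving the "with" block terminates the worker processes.
    with mp.Pool(n_workers) as pool:
        return pool.map(square, range(n_items))


if __name__ == '__main__':
    # 'spawn' on Windows; on Linux the default has historically been 'fork'.
    print(mp.get_start_method())
    print(run_pool())
```

With a start method other than fork, removing the guard should make every child execute the pool creation again while it is still being set up, which is what the RuntimeError complains about.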

This is the error I get, and it points to the same solution:

RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
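If the guarded version still fails inside a notebook on Windows, a commonly suggested workaround is to move the worker function into a real .py file, so that the child processes can import it by name instead of looking it up in the notebook's __main__. The following is only a sketch under that assumption: the module name pool_workers_demo and the helper load_worker_module are hypothetical, and it writes the file to a temporary directory, whereas in a notebook you would more likely save the file next to the notebook and import it directly.

```python
import importlib
import os
import sys
import tempfile
from multiprocessing import Pool

# Hypothetical module name and contents, chosen only for this sketch.
MODULE_NAME = "pool_workers_demo"
MODULE_SOURCE = "def f(x):\n    return x ** 2\n"


def load_worker_module(directory):
    """Write the worker module into `directory` and import it."""
    path = os.path.join(directory, MODULE_NAME + ".py")
    with open(path, "w") as handle:
        handle.write(MODULE_SOURCE)
    # The directory must be on sys.path so that the parent, and the
    # children that inherit this path, can import the module by name.
    if directory not in sys.path:
        sys.path.insert(0, directory)
    importlib.invalidate_caches()
    return importlib.import_module(MODULE_NAME)


if __name__ == '__main__':
    workers = load_worker_module(tempfile.mkdtemp())
    with Pool(4) as pool:
        for res in pool.map(workers.f, range(20)):
            print(res)
```

The function is then pickled as a reference to pool_workers_demo.f, which a freshly started child can resolve by importing that module.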