Writing a parallel loop

KMA · Nov 18, 2015 · Viewed 15.5k times

I am trying to run a parallel loop on a simple example.
What am I doing wrong?

from joblib import Parallel, delayed  
import multiprocessing

def processInput(i):
    return i * i

if __name__ == '__main__':

    # what are your inputs, and what operation do you want to 
    # perform on each input. For example...
    inputs = range(1000000)      

    num_cores = multiprocessing.cpu_count()

    results = Parallel(n_jobs=4)(delayed(processInput)(i) for i in inputs) 

    print(results)

The problem is that when this code runs on Windows under Python 3, it opens num_cores instances of Python to execute the parallel jobs, but only one of them is active. I would expect processor activity to be close to 100%, not 14% (on an i7 with 8 logical cores).

Why are the extra instances not doing anything?

Answer

Fanchi · Feb 5, 2016

Following up on your request for working multiprocessing code, I suggest using Pool.map from the multiprocessing module (if joblib's delayed functionality is not important to you); I'll give you an example. If you're working in Python 3, it's worth mentioning that you can use starmap when the function takes more than one argument. Also worth mentioning: map_async and starmap_async return immediately instead of blocking, but their results still come back in input order. If the order of the returned results does not have to correspond to the order of the inputs, use imap_unordered.

import multiprocessing as mp

def processInput(i):
    return i * i

if __name__ == '__main__':

    # what are your inputs, and what operation do you want to
    # perform on each input. For example...
    inputs = range(1000000)
    # Omitting the processes argument makes the pool use all available cores.
    # The with block shuts the worker processes down once the map has finished.
    with mp.Pool(processes=4) as pool:
        results = pool.map(processInput, inputs)
    print(results)
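
For the other Pool methods mentioned above, here is a minimal sketch of how starmap, map_async and imap_unordered could be used. The helper names power, square and run_examples are made up for this illustration and are not from the question; the small default input size is only there to keep the output readable.

```python
import multiprocessing as mp


def power(base, exponent):
    # Hypothetical two-argument worker: starmap unpacks each input
    # tuple into these two parameters.
    return base ** exponent


def square(i):
    # Same operation as processInput in the example above.
    return i * i


def run_examples(n=10, processes=4):
    # Hypothetical helper that runs the three variants on the same pool.
    with mp.Pool(processes=processes) as pool:
        # starmap (Python 3.3+): one tuple of arguments per call.
        powers = pool.starmap(power, [(i, 2) for i in range(n)])

        # map_async returns an AsyncResult right away; get() blocks
        # until the work is done and returns results in input order.
        async_result = pool.map_async(square, range(n))
        squares = async_result.get()

        # imap_unordered yields each result as soon as a worker finishes
        # it, so the order may differ from the input order.
        unordered = list(pool.imap_unordered(square, range(n)))
    return powers, squares, unordered


if __name__ == '__main__':
    powers, squares, unordered = run_examples()
    print(powers)
    print(squares)
    print(sorted(unordered))
```

The worker functions are defined at module level and the pool is only created under the __main__ guard, which is what Windows needs because child processes re-import the main module.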