What's the difference between using map and map_async? Are they not running the same function after distributing the items from the list to 4 processes?
So is it wrong to presume both run asynchronously and in parallel?
from multiprocessing import Pool

def f(x):
    return 2*x

p = Pool(4)
l = [1, 2, 3, 4]
out1 = p.map(f, l)
# vs
out2 = p.map_async(f, l)
There are four choices for mapping jobs to processes. You have to consider multi-args, concurrency, blocking, and ordering. map and map_async only differ with respect to blocking: map_async is non-blocking, whereas map is blocking.
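As a rough illustration of that difference, here is a minimal sketch of what each call hands back. The names double and run_both are made up for this example and are not part of multiprocessing; the only library calls used are Pool.map, Pool.map_async and AsyncResult.get.

```python
from multiprocessing import Pool


def double(x):
    # Hypothetical worker function, used only for this illustration.
    return 2 * x


def run_both(pool, items):
    # Hypothetical helper: runs the same job through both calls.
    # map blocks until every result is ready and returns a plain list.
    blocking = pool.map(double, items)
    # map_async returns an AsyncResult right away; the work carries on
    # in the worker processes while the caller is free to continue.
    pending = pool.map_async(double, items)
    # get() blocks until the results are ready, in input order.
    return blocking, pending.get()


if __name__ == "__main__":
    with Pool(4) as pool:
        print(run_both(pool, [1, 2, 3, 4]))
```

Both halves of the returned tuple hold the same list; the only thing that differs is when the caller gets control back.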
So let's say you had a function:
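from multiprocessing import Pool
import time

def f(x):
    print(x*x)

if __name__ == '__main__':
    pool = Pool(processes=4)
    # Blocks until all ten calls have finished.
    pool.map(f, range(10))
    # Returns immediately with an AsyncResult.
    r = pool.map_async(f, range(10))
    # DO STUFF
    print('HERE')
    print('MORE')
    # Block until the asynchronous calls have finished.
    r.wait()
    print('DONE')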
Example output:
0
1
9
4
16
25
36
49
64
81
0
HERE
1
4
MORE
16
25
36
9
49
64
81
DONE
pool.map(f, range(10)) will wait for all 10 of those function calls to finish, so we see all the prints in a row.
r = pool.map_async(f, range(10)) will execute them asynchronously and only block when r.wait() is called, so we see HERE and MORE in between, but DONE will always be at the end.
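If you want to see the blocking behaviour without reading interleaved prints, one option is to time the two calls. The following is only a sketch: slow_square, time_calls and the 0.2-second sleep are invented for the illustration, and the exact timings will vary by machine.

```python
import time
from multiprocessing import Pool


def slow_square(x):
    # Hypothetical worker; the sleep stands in for real work.
    time.sleep(0.2)
    return x * x


def time_calls(pool, items):
    # Hypothetical helper: measures how long each call holds the caller.
    start = time.perf_counter()
    pool.map(slow_square, items)  # returns only after every item is done
    map_elapsed = time.perf_counter() - start

    start = time.perf_counter()
    result = pool.map_async(slow_square, items)  # returns right away
    submit_elapsed = time.perf_counter() - start

    values = result.get()  # this is where the caller finally blocks
    return map_elapsed, submit_elapsed, values


if __name__ == "__main__":
    with Pool(4) as pool:
        map_elapsed, submit_elapsed, values = time_calls(pool, range(8))
        print("map held the caller for %.3f s" % map_elapsed)
        print("map_async held the caller for %.3f s" % submit_elapsed)
        print(values)
```

The map call should take at least as long as one sleep, while the map_async call itself should come back almost instantly; the waiting moves to result.get().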