I'm looking for a simple process-based parallel map for Python, that is, a function
parmap(function,[data])
that would run the function on each element of [data] in a different process (well, on a different core, but AFAIK the only way to run code on different cores in Python is to start multiple interpreters) and return a list of results.
Does something like this exist? I would like something simple, so a simple module would be nice. Of course, if no such thing exists, I will settle for a big library :-/
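To make it concrete, this is roughly what I mean (just a sketch I put together on top of multiprocessing; the name parmap and the processes argument are mine, I don't know if a ready-made version of this exists):

import multiprocessing

def parmap(function, data, processes=None):
    # Start a pool of worker processes (one per core by default),
    # map the function over the data, and return the results in order.
    pool = multiprocessing.Pool(processes)
    try:
        return pool.map(function, data)
    finally:
        pool.close()
        pool.join()

def square(x):
    return x**2

if __name__ == '__main__':
    print(parmap(square, [1, 2, 3, 4]))   # -> [1, 4, 9, 16]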
It seems like what you need is the map method of multiprocessing.Pool():
map(func, iterable[, chunksize])
A parallel equivalent of the map() built-in function (it supports only one iterable argument though). It blocks until the result is ready. This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer.
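As a rough illustration of the chunksize argument (the function and the value 50 below are made up for this example), a larger chunksize means fewer but bigger messages between the parent process and the workers:

import multiprocessing

def double(x):
    return 2 * x

if __name__ == '__main__':
    pool = multiprocessing.Pool()
    # Send the 1000 inputs to the workers in batches of 50 instead of one at a time.
    results = pool.map(double, range(1000), chunksize=50)
    pool.close()
    pool.join()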
For example, if you wanted to map this function:
def f(x):
    return x**2
to range(10), you could do it using the built-in map() function:
map(f, range(10))
or using the map() method of a multiprocessing.Pool object:
import multiprocessing

if __name__ == '__main__':   # guard needed on platforms that spawn processes (e.g. Windows)
    pool = multiprocessing.Pool()
    print(pool.map(f, range(10)))
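This prints [0, 1, 4, 9, 16, 25, 36, 49, 64, 81], the same result as the built-in map, just computed in separate worker processes. On Python 3.3 or later the pool can also be used as a context manager so the workers get cleaned up automatically (a small variation on the snippet above):

from multiprocessing import Pool

def f(x):
    return x**2

if __name__ == '__main__':
    # The with-block terminates the worker processes when it exits.
    with Pool() as pool:
        print(pool.map(f, range(10)))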