I'm a multiprocessing newbie,
I know something about threading but I need to increase the speed of this calculation, hopefully with multiprocessing:
Example Description: sends string to a thread, alters string + benchmark test, send result back for printing.
from threading import Thread class Alter(Thread): def __init__(self, word): Thread.__init__(self) self.word = word self.word2 = '' def run(self): # Alter string + test processing speed for i in range(80000): self.word2 = self.word2 + self.word # Send a string to be altered thread1 = Alter('foo') thread2 = Alter('bar') thread1.start() thread2.start() #wait for both to finish while thread1.is_alive() == True: pass while thread2.is_alive() == True: pass print(thread1.word2) print(thread2.word2)
This is currently takes about 6 seconds and I need it to go faster.
I have been looking into multiprocessing and cannot find something equivalent to the above code. I think what I am after is pooling but examples I have found have been hard to understand. I would like to take advantage of all cores (8 cores) multiprocessing.cpu_count()
but I really just have scraps of useful information on multiprocessing and not enough to duplicate the above code. If anyone can point me in the right direction or better yet, provide an example that would be greatly appreciated. Python 3 please
Just replace threading
with multiprocessing
and Thread
with Process
. Threads in Pyton are (almost) never used to gain performance because of the big bad GIL! I explained it in an another SO-post with some links to documentation and a great talk about threading in python.
But the multiprocessing module is intentionally very similar to the threading module. You can almost use it as an drop-in replacement!
The multiprocessing module doesn't AFAIK offer a functionality to enforce the use of a specific amount of cores. It relies on the OS-implementation. You could use the Pool object and limit the worker-onjects to the core-count. Or you could look for an other MPI library like pypar. Under Linux you could use a pipe under the shell to start multiple instances on different cores