I have experience writing OpenMP code for shared-memory machines (in both C and FORTRAN) for simple tasks such as matrix addition and multiplication (just to see how it competes with LAPACK). I know OpenMP well enough to carry out simple tasks without needing to look at the documentation.
Recently, I switched to Python for my projects, and I don't have any experience with Python beyond the absolute basics.
My question is:
What is the easiest way to use OpenMP in Python? By easiest, I mean the one that takes the least effort on the programmer's side (even if it comes at the expense of added system time).
The reason I use OpenMP is because a serial code can be converted to a working parallel code with a few !$OMPs scattered around. The time required to achieve a rough parallelization is fascinatingly small. Is there any way to replicate this feature in Python?
From browsing around on SO, I can find:
Are there more? Which aligns best with my question?
Because of the GIL, there is no point in using threads for CPU-intensive tasks in CPython. You need either multiprocessing (example) or C extensions that release the GIL during computations, e.g., some of the numpy functions (example).
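As a rough illustration of the multiprocessing route, here is a minimal sketch using the standard library's multiprocessing.Pool. The function names square and parallel_squares and the default of 4 workers are made up for this example; the per-item work is a stand-in for whatever CPU-bound loop body you would otherwise put under an OpenMP parallel-for.

```python
from multiprocessing import Pool


def square(x):
    """Stand-in for CPU-bound work on one item.

    Hypothetical example function; it is defined at module level so
    that it can be pickled and sent to the worker processes.
    """
    return x * x


def parallel_squares(values, workers=4):
    """Apply square() to every value using a pool of worker processes.

    Each worker is a separate Python process with its own interpreter
    and its own GIL, so the work can run on several cores at once.
    """
    with Pool(processes=workers) as pool:
        # map() splits the input across workers and returns results in order.
        return pool.map(square, values)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block
    # when they import the main module.
    print(parallel_squares(range(10)))
```

Unlike OpenMP threads, the workers do not share memory: arguments and results are pickled and copied between processes, so this pays off mainly when each item involves enough computation to outweigh that overhead.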
You could also easily write C extensions that use multiple threads in Cython (example).