Is concurrent.futures a medicine for the GIL?

Abdelouahab Pp · Feb 21, 2013

I was just reading about this new implementation. I use Python 2.7, so I have to install it separately. If I use it, can I forget about the GIL on CPython?

Answer

abarnert · Feb 21, 2013

No, concurrent.futures has almost nothing whatsoever to do with the GIL.

Using processes instead of threads is medicine for the GIL. (Of course, like all medicine, it has side effects. But it works.)

The futures module just gives you a simpler way to schedule and wait on tasks than using threading or multiprocessing directly. And it has the added advantage that you can swap between a thread pool and a process pool (and maybe even a greenlet loop, or something crazy you invent and build) without changing the future code. So, if you don't know whether your code will have GIL problems, you can build it to use threads, and then switch it to use processes with a one-line change, which is pretty nice.
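To make the one-line change concrete, here is a minimal sketch. The square task and the run_tasks helper are made-up names for illustration; only the executor classes, submit, and as_completed come from the futures module itself.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor, as_completed


def square(n):
    # Hypothetical stand-in task. It is a top-level function so that a
    # process pool can pickle it and send it to a worker process.
    return n * n


def run_tasks(executor_class, numbers):
    # Everything in here is the same whichever executor class is passed in.
    with executor_class(max_workers=4) as executor:
        futures = {executor.submit(square, n): n for n in numbers}
        return {futures[f]: f.result() for f in as_completed(futures)}


if __name__ == "__main__":
    Executor = ThreadPoolExecutor      # the one line to change ...
    # Executor = ProcessPoolExecutor   # ... to move from threads to processes
    print(run_tasks(Executor, range(5)))
```

The swap is only this painless if your tasks and their arguments can be pickled, which is one of the side effects of the medicine.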

But, if you use the ThreadPoolExecutor, it will have the exact same GIL issues as if you created a thread pool, task queue, etc. manually with threading and queue. If you use the ProcessPoolExecutor, it will avoid the GIL issues in the same way (and with the same tradeoffs) as if you used multiprocessing manually.
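For comparison, here is roughly what "manually with threading and queue" looks like next to the executor version. This is a simplified sketch, not how ThreadPoolExecutor is actually implemented; work, manual_pool, and executor_pool are names I made up, and it assumes Python 3 (on 2.7 the module is spelled Queue).

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def work(n):
    # Hypothetical stand-in task.
    return n + 1


def manual_pool(func, items, num_workers=4):
    # Hand-rolled thread pool: a task queue feeding a fixed set of workers.
    tasks = queue.Queue()
    results = [None] * len(items)

    def worker():
        while True:
            task = tasks.get()
            if task is None:  # sentinel: no more work for this worker
                break
            index, item = task
            results[index] = func(item)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for index, item in enumerate(items):
        tasks.put((index, item))
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results


def executor_pool(func, items, num_workers=4):
    # Same threads, same GIL, far less code.
    with ThreadPoolExecutor(max_workers=num_workers) as executor:
        return list(executor.map(func, items))


if __name__ == "__main__":
    print(manual_pool(work, [1, 2, 3]))
    print(executor_pool(work, [1, 2, 3]))
```

Both versions run func on ordinary threads in the same interpreter, so neither one changes how the GIL behaves.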

And the PyPI package is just a simple backport of the concurrent.futures module from 3.2 to 2.x (and 3.0-3.1). (It doesn't magically give you the new-and-sort-of-improved 3.2 GIL, or the more-improved 3.3 GIL, much less remove the GIL.)


I probably shouldn't even have mentioned the GIL changes, because this seems to have just added confusion… but now, let me try to straighten it out, by oversimplifying terribly.

If you have nothing but IO-bound work, threads are a great way to get concurrency, up to a reasonable limit. And 3.3 does make them work even better—but for most cases, 2.7 is already good enough, and, for most cases where it isn't, 3.3 still isn't good enough. If you want to handle 10000 simultaneous clients, you're going to want to use an event loop (e.g., twisted, tornado, gevent, tulip, etc.) instead of threads.
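Here is a small sketch of the IO-bound case. I'm using time.sleep as a stand-in for a blocking network call (an assumption for the demo, since a sleeping thread releases the GIL much like one waiting on a socket); fake_fetch and fetch_all are made-up names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(delay):
    # Hypothetical stand-in for a blocking IO call. The thread gives up
    # the GIL while it sleeps, so other threads can run in the meantime.
    time.sleep(delay)
    return delay


def fetch_all(delays, max_workers=8):
    # Run every "request" on a thread pool and report the wall-clock time.
    start = time.time()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        results = list(executor.map(fake_fetch, delays))
    return results, time.time() - start


if __name__ == "__main__":
    results, elapsed = fetch_all([0.2] * 8)
    print("8 waits of 0.2s each took %.2fs in total" % elapsed)
```

The waits should overlap, so the total ought to land much closer to one wait than to the sum of all eight. That works for a handful of connections; it is not a recipe for 10000 of them.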

If you have any CPU-bound work, threads don't help parallelize that work at all. In fact, they make things worse. 3.3 makes that penalty not quite as bad, but it's still a penalty, and you still shouldn't ever do this. If you want to parallelize CPU work, you have to use processes, not threads. The only advantage of 3.3 is that futures is a little easier to use than multiprocessing, and comes built-in instead of needing to install it.
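If you want to see this for yourself, a rough timing sketch along these lines should do it. The count_down loop and timed_run helper are hypothetical names, and the job size is an arbitrary pick; treat it as an experiment to run, not a benchmark.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_down(n):
    # Pure-Python loop: CPU-bound, and it needs the GIL for every step.
    while n > 0:
        n -= 1
    return n


def timed_run(executor_class, jobs, workers=4):
    # Run the same jobs on the given kind of pool and time the whole batch.
    start = time.time()
    with executor_class(max_workers=workers) as executor:
        results = list(executor.map(count_down, jobs))
    return results, time.time() - start


if __name__ == "__main__":
    jobs = [2000000] * 4
    for executor_class in (ThreadPoolExecutor, ProcessPoolExecutor):
        _, elapsed = timed_run(executor_class, jobs)
        print("%s: %.2fs" % (executor_class.__name__, elapsed))
```

On a multi-core machine I'd expect the process pool to finish noticeably faster than the thread pool, but the exact numbers depend on your hardware and your Python version.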

I don't want to discourage you from moving to 3.3, because it's a better implementation of a better language than 2.7. But better concurrency is not a reason to move.