As Wikipedia states:
Green threads emulate multi-threaded environments without relying on any native OS capabilities, and they are managed in user space instead of kernel space, enabling them to work in environments that do not have native thread support.
Python's threads are implemented as pthreads (kernel threads)
,
and because of the global interpreter lock (GIL), a Python process only runs one thread at a time.
[QUESTION]
But in the case of Green-threads
(or so-called greenlet or tasklets),
- Does the
GIL
affect them? Can there be more than one greenlet running at a time?- What are the pitfalls of using greenlets or tasklets?
- If I use greenlets, how many of them can a process can handle? (I am wondering because in a single process you can open threads up to ulimit(-s, -v) set in your *ix system.)
I need a little insight, and it would help if someone could share their experience, or guide me to the right path.
You can think of greenlets more like cooperative threads. What this means is that there is no scheduler pre-emptively switching between your threads at any given moment - instead your greenlets voluntarily/explicitly give up control to one another at specified points in your code.
Does the GIL affect them? Can there be more than one greenlet running at a time?
Only one code path is running at a time - the advantage is you have ultimate control over which one that is.
What are the pitfalls of using greenlets or tasklets?
You need to be more careful - a badly written greenlet will not yield control to other greenlets. On the other hand, since you know when a greenlet will context switch, you may be able to get away with not creating locks for shared data-structures.
If I use greenlets, how many of them can a process can handle? (I am wondering because in a single process you can open threads up to umask limit set in your *ix system.)
With regular threads, the more you have the more scheduler overhead you have. Also regular threads still have a relatively high context-switch overhead. Greenlets do not have this overhead associated with them. From the bottle documentation:
Most servers limit the size of their worker pools to a relatively low number of concurrent threads, due to the high overhead involved in switching between and creating new threads. While threads are cheap compared to processes (forks), they are still expensive to create for each new connection.
The gevent module adds greenlets to the mix. Greenlets behave similar to traditional threads, but are very cheap to create. A gevent-based server can spawn thousands of greenlets (one for each connection) with almost no overhead. Blocking individual greenlets has no impact on the servers ability to accept new requests. The number of concurrent connections is virtually unlimited.
There's also some further reading here if you're interested: http://sdiehl.github.io/gevent-tutorial/