Tornado/Twisted - Celery - Gevent Comparison

arnoutaertgeerts · Jun 15, 2013

I'm having a bit of trouble understanding the differences between these three frameworks: Tornado/Twisted, Celery, and Gevent.

All three can be used to run code concurrently, but they do so in different ways, using different numbers of threads/processes and different coding styles. This is how I understand the differences right now:

  • Tornado/Twisted use asynchronous code controlled by an I/O loop. This allows the code to be run on a single thread (multiple threads are unnecessary if the code is non-blocking).
  • Celery uses a task-based system to run code asynchronously; the code itself is still synchronous. A main process distributes the tasks among workers running in other processes.
  • Gevent uses a thread-based system and spawns a thread to process each incoming connection.
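The single-threaded I/O loop described in the first bullet can be sketched with the standard library's asyncio, used here only as a stand-in for the Tornado/Twisted loop. The names `handle`, `main`, and `run_demo` are hypothetical, and `asyncio.sleep` simulates waiting on a socket. This is a minimal sketch of the idea, not how either framework is implemented.

```python
import asyncio
import threading
import time


async def handle(name, delay, seen_threads):
    # Record which OS thread runs this handler.
    seen_threads.add(threading.get_ident())
    # Non-blocking wait: control returns to the loop so other handlers can run.
    await asyncio.sleep(delay)
    return name


async def main():
    seen_threads = set()
    start = time.monotonic()
    # Three simulated requests are in flight at once on one loop.
    results = await asyncio.gather(
        handle("a", 0.2, seen_threads),
        handle("b", 0.2, seen_threads),
        handle("c", 0.2, seen_threads),
    )
    elapsed = time.monotonic() - start
    return results, elapsed, len(seen_threads)


def run_demo():
    return asyncio.run(main())


if __name__ == "__main__":
    print(run_demo())
```

All three handlers run on the same thread, and the total time is close to one delay rather than the sum of the three, because each handler yields to the loop while it waits.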

The questions I have right now are:

  1. Is my understanding of these frameworks correct?
  2. The major difference between a thread and a process is that different threads use the same memory whereas processes do not. Is one process normally run on one server core? (That would make Celery hard to run on a small server.)
  3. If we are talking about web applications and sockets:

Tornado/Twisted are able to accept (almost) any number of sockets because they use asynchronous code and queue the requests in the I/O loop.

Are Celery/Gevent able to do this? Do they have to spawn a new process/thread to accept each new socket?

I'm trying to figure out which of these technologies is best suited to building a real-time web application.

Answer

Philip Cristiano · Jun 16, 2013
  1. Gevent uses greenlets instead of threads, running on an I/O loop implicitly, so there is no reactor / I/O loop to start manually as there is with Twisted/Tornado. It can also monkey-patch existing libraries to support its evented operation, whereas Tornado and Twisted require libraries written specifically for their event loops, although you will find many already exist.

    Celery is made much more for background processing: offloading expensive computations to another process or server.

  2. Processes can share memory, but not in the same way that threads do. Threads in CPython suffer from the GIL, so a threaded solution is generally not worth using if you are doing anything CPU-intensive.

    I'm not sure of Celery's memory requirements, but if you are using 1 web process and 1 background process you should be fine even on a 256MB VPS, although more is better if you are supporting many connections.

  3. The number of sockets that can be handled with Tornado/Twisted/Gevent will likely be bounded by the amount and frequency of I/O done per socket. It is much easier to support a large number of concurrent connections when the sockets are low-frequency/low-bandwidth, as they will mostly be idle.

    Celery will still require some application to listen on sockets and submit work to be processed by the Celery daemon. It supports Gevent as well, so you can handle multiple tasks concurrently if required.
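The division of labour in points 1 and 3 (a listening application hands expensive work to a separate process and keeps serving in the meantime) can be sketched without Celery. The example below is an assumption-laden stand-in: it uses asyncio and a child Python interpreter from the standard library instead of a Celery worker and message broker, and `WORKER_CODE`, `offload`, `heartbeat`, and `run_demo` are hypothetical names. It only illustrates the pattern, not Celery's API.

```python
import asyncio
import sys

# Hypothetical CPU-heavy job, executed by a separate Python process.
WORKER_CODE = "print(sum(i * i for i in range(200_000)))"


async def offload():
    # Start a separate interpreter; it has its own GIL and its own memory.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", WORKER_CODE,
        stdout=asyncio.subprocess.PIPE,
    )
    # Waiting for the result does not block the event loop.
    out, _ = await proc.communicate()
    return int(out.decode().strip())


async def heartbeat(ticks):
    # Stands in for the web process continuing to serve its sockets.
    for _ in range(3):
        ticks.append("tick")
        await asyncio.sleep(0.01)


async def main():
    ticks = []
    # The offloaded job and the "serving" coroutine make progress together.
    result, _ = await asyncio.gather(offload(), heartbeat(ticks))
    return result, ticks


def run_demo():
    return asyncio.run(main())


if __name__ == "__main__":
    print(run_demo())
```

With Celery the child interpreter would be replaced by a long-running worker that receives tasks through a broker, but the listening process plays the same role: it submits the work and stays free to handle connections.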