Asynchronous Programming in Python Twisted

David · Sep 17, 2008 · Viewed 8.7k times

I'm having trouble developing a reverse proxy in Twisted. It works, but it seems overly complex and convoluted. So much of it feels like voodoo.

Are there any simple, solid examples of asynchronous program structure on the web or in books? A sort of best practices guide? When I complete my program I'd like to be able to still see the structure in some way, not be looking at a bowl of spaghetti.

Answer

Glyph · Sep 17, 2008

Twisted contains a large number of examples. One in particular, the "evolution of Finger" tutorial, contains a thorough explanation of how an asynchronous program grows from a very small kernel up to a complex system with lots of moving parts. Another one that might be of interest to you is the tutorial about simply writing servers.

The key thing to keep in mind about Twisted, or any other asynchronous networking library (such as asyncore, MINA, or ACE), is that your code only gets invoked when something happens. The part I've most often heard described as "voodoo" is the management of callbacks: for example, Deferred. If you're used to writing code that runs in a straight line and only calls functions that return immediately with results, the idea of waiting for something to call you back might be confusing. But there's nothing magical, no "voodoo", about callbacks. At the lowest level, the reactor is just sitting around and waiting for one of a small number of things to happen:

  1. Data arrives on a connection (it will call dataReceived on a Protocol).
  2. Time has passed (it will call a function registered with callLater).
  3. A connection has been accepted (it will call buildProtocol on a factory registered with a listenXXX or connectXXX function).
  4. A connection has been dropped (it will call connectionLost on the appropriate Protocol).
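To make that list concrete, here is a rough sketch in which a plain function plays the role of the reactor and calls the handlers in the order the events would occur. The method names dataReceived, buildProtocol and connectionLost mirror the ones above, but RecordingProtocol, RecordingFactory and simulate_connection are hypothetical names made up for this illustration; nothing here imports Twisted, and the timer case (callLater) is left out.

```python
# Illustrative only: these classes imitate the handler names listed above,
# but nothing here imports or depends on Twisted.
class RecordingProtocol(object):
    """Hypothetical protocol that records every event it is told about."""

    def __init__(self, log):
        self.log = log

    def dataReceived(self, data):
        self.log.append(('data', data))

    def connectionLost(self, reason):
        self.log.append(('lost', reason))


class RecordingFactory(object):
    """Hypothetical factory: builds one protocol per accepted connection."""

    def __init__(self, log):
        self.log = log

    def buildProtocol(self, addr):
        self.log.append(('accepted', addr))
        return RecordingProtocol(self.log)


def simulate_connection(factory, addr, chunks):
    """Stand-in for the reactor: call the handlers as the events 'happen'."""
    protocol = factory.buildProtocol(addr)      # a connection was accepted
    for chunk in chunks:
        protocol.dataReceived(chunk)            # data arrived
    protocol.connectionLost('closed by peer')   # the connection was dropped
    return protocol


log = []
simulate_connection(RecordingFactory(log), ('127.0.0.1', 12345),
                    [b'GET /', b'\r\n'])
for event in log:
    print(event)
```

The point is only that each "event" ends up as an ordinary method call on an ordinary object; a real reactor decides when to make those calls by watching sockets and timers.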

Every asynchronous program starts by hooking up a few of these events and then kicking off the reactor to wait for them to happen. Of course, events that happen lead to more events getting hooked up or disconnected, and so your program goes on its merry way. Beyond that, there's nothing particularly interesting or special about asynchronous program structure; event handlers and callbacks are just objects, and your code is run in the usual way.
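As a small illustration of "callbacks are just objects", the sketch below shows roughly what a Deferred-style callback chain boils down to. MiniDeferred is a hypothetical, heavily simplified class written for this answer, not Twisted's implementation: it borrows the addCallback and callback method names but leaves out error handling and everything else a real Deferred does.

```python
class MiniDeferred(object):
    """Hypothetical, stripped-down stand-in for a Deferred.

    It stores callbacks and runs them in order once a result shows up.
    """

    _NO_RESULT = object()   # sentinel meaning "no result has arrived yet"

    def __init__(self):
        self.callbacks = []
        self.result = self._NO_RESULT

    def addCallback(self, function):
        # Remember the function; run it right away if a result is already here.
        self.callbacks.append(function)
        self._run()
        return self

    def callback(self, result):
        # Called by whoever produced the result, possibly much later.
        self.result = result
        self._run()

    def _run(self):
        if self.result is self._NO_RESULT:
            return
        while self.callbacks:
            function = self.callbacks.pop(0)
            # Each callback receives the previous callback's return value.
            self.result = function(self.result)


seen = []
d = MiniDeferred()
d.addCallback(lambda text: text.upper())
d.addCallback(seen.append)

# Nothing has run yet, because the result has not arrived.
assert seen == []

# Later, the code that was "waiting" for the result fires the chain.
d.callback('hello')
print(seen)
```

There is no hidden machinery: the chain is a list of functions, and firing it is a loop over that list.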

Here's a simple "event-driven engine" that shows you just how simple this process is.

# Engine
import time

class SimplestReactor(object):
    def __init__(self):
        self.events = []        # queue of callables waiting to be run
        self.stopped = False

    def do(self, something):
        # Schedule a callable to run on a later turn of the loop.
        self.events.append(something)

    def run(self):
        # Main loop: wait a little, then run at most one pending event.
        while not self.stopped:
            time.sleep(0.1)
            if self.events:
                thisTurn = self.events.pop(0)
                thisTurn()

    def stop(self):
        self.stopped = True

reactor = SimplestReactor()

# Application
def thing1():
    print('Doing thing 1')
    # These run on later turns of the loop, not immediately.
    reactor.do(thing2)
    reactor.do(thing3)

def thing2():
    print('Doing thing 2')

def thing3():
    print('Doing thing 3: and stopping')
    reactor.stop()

reactor.do(thing1)
print('Running')
reactor.run()
print('Done!')

At the core of libraries like Twisted, the function in the main loop is not sleep, but an operating system call like select() or poll(), as exposed by a module like the Python select module. I say "like" select, because this is an API that varies a lot between platforms, and almost every GUI toolkit has its own version. Twisted currently provides an abstract interface to 14 different variations on this theme. What such APIs have in common is a way to say "Here is a list of events that I'm waiting for. Go to sleep until one of them happens, then wake up and tell me which one of them it was."
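For a feel of what that "go to sleep until something happens" call looks like, here is a minimal sketch using the standard library's select.select() and socket.socketpair(). The socket pair and the variable names (left, right, idle, woke_right) are just scaffolding for the demonstration; a real reactor would make this call inside its main loop and dispatch to the appropriate handlers.

```python
import select
import socket

# A connected pair of sockets stands in for a real network connection.
left, right = socket.socketpair()

# Nothing has been sent yet, so with a zero timeout select() reports
# that neither socket is ready to read.
readable, _, _ = select.select([left, right], [], [], 0)
idle = list(readable)

# Make one "event" happen: data arrives on `right`.
left.sendall(b'ping')

# Now select() returns and tells us which socket the event was on.
readable, _, _ = select.select([left, right], [], [], 1.0)
woke_right = (readable == [right])
received = readable[0].recv(4) if readable else b''

left.close()
right.close()
print(idle, woke_right, received)
```

Replacing time.sleep(0.1) with a call like this is what lets an event loop do nothing at all until there is actually work to do, instead of waking up on a fixed schedule.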