Tornado streaming HTTP response as AsyncHTTPClient receives chunks

majackson · Nov 16, 2013

I'm trying to write a Tornado request handler that makes asynchronous HTTP requests and returns data to the client as it receives it from those requests. Unfortunately, I can't get Tornado to return any data to the client until all of its asynchronous HTTP requests have completed.

A demo of my request handler is below.

class StreamingHandler(web.RequestHandler):

    all_requested = False
    requests = []

    @web.asynchronous
    def get(self):

        http_client = httpclient.AsyncHTTPClient()
        self.write('some opening')

        big_request = httpclient.HTTPRequest(url='[some_big_request]', streaming_callback=self.on_chunk)
        small_request = httpclient.HTTPRequest(url='[some_small_request]', streaming_callback=self.on_chunk)

        self.requests.append(http_client.fetch(big_request, callback=self.on_response_complete))
        self.requests.append(http_client.fetch(small_request, callback=self.on_response_complete))

        self.all_requested = True

    def on_chunk(self, chunk):
        self.write('some chunk')
        self.flush()

    def on_response_complete(self, response):
        if self.all_requested and all(request.done() for request in self.requests):
            self.write('some closing')
            self.finish()

I would expect a GET request to this handler to initially return the text 'some opening', then quite quickly return 'some chunk' for the small request, and later return 'some chunk' (potentially multiple times) for the larger request, before finally returning 'some closing' and closing the connection. Instead, after connecting, the client waits a few seconds for all requests to complete and then receives the entire response at once, just before the connection closes.

How would I go about getting my desired behaviour from Tornado?

Thanks in advance!

Answer

Blender · Nov 18, 2013

Decorate your method with gen.coroutine and yield a list of futures. Here's a simple example:

from tornado import gen, web, httpclient

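class StreamingHandler(web.RequestHandler):
    # @web.asynchronous is not needed alongside @gen.coroutine since
    # Tornado 3.1, and the decorator was removed in Tornado 6.0.
    @gen.coroutine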
    def get(self):
        client = httpclient.AsyncHTTPClient()

        self.write('some opening')
        self.flush()

        requests = [
            httpclient.HTTPRequest(
                url='http://httpbin.org/delay/' + str(delay),
                streaming_callback=self.on_chunk
            ) for delay in [5, 4, 3, 2, 1]
        ]

        # `map()` doesn't return a list in Python 3
        yield list(map(client.fetch, requests))

        self.write('some closing')
        self.finish()

    def on_chunk(self, chunk):
        self.write('some chunk')
        self.flush()

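Notice that even though the requests are yielded "backwards" (longest delay first), the first chunk will still be received after about a second, because chunks are written in the order the responses arrive, not the order the requests were created. If you sent the requests one after another, the whole thing would take 15 seconds (5 + 4 + 3 + 2 + 1). When you issue them concurrently, it takes only about 5, the duration of the slowest request.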
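If you want to check the timing claim without touching the network, here is a minimal sketch using only the standard library's asyncio rather than Tornado. The names fake_fetch, run_concurrently and DELAYS are hypothetical, asyncio.sleep stands in for the httpbin delay endpoints, and the delays are scaled from seconds down to tenths of a second. It illustrates the ordering and total-time behaviour only; it is not the handler itself.

```python
import asyncio
import time

# Hypothetical stand-in for one HTTP request: waits `delay` seconds, then
# hands a single "chunk" to the callback, much like streaming_callback would.
async def fake_fetch(delay, on_chunk):
    await asyncio.sleep(delay)
    on_chunk(delay)

async def run_concurrently(delays):
    arrivals = []  # delays, recorded in the order their chunks arrived
    start = time.monotonic()
    # Start every fetch before waiting on any of them.
    await asyncio.gather(*(fake_fetch(d, arrivals.append) for d in delays))
    return arrivals, time.monotonic() - start

# Same 5, 4, 3, 2, 1 pattern as the handler, scaled to tenths of a second.
DELAYS = [0.5, 0.4, 0.3, 0.2, 0.1]

if __name__ == "__main__":
    arrivals, elapsed = asyncio.run(run_concurrently(DELAYS))
    print("arrival order:", arrivals)
    print("elapsed: %.2fs (sum of delays: %.2fs)" % (elapsed, sum(DELAYS)))
```

The shortest delay should show up first in the arrival order even though it was started last, and the elapsed time should be close to the longest single delay rather than the sum of all five.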