Asynchronous background processes with web2py

Kenny Meyer picture Kenny Meyer · Dec 29, 2011 · Viewed 7.4k times · Source

I need to to handle a large (time and memory-consuming) process asynchronously in a web2py application called inside a controller method.

My specific use case is to call a process via stdlib.subprocess and wait for it to exit without blocking the web server, but I am open to alternative methods.

  • Hands-on examples would be a plus.
  • 3rd party library recommendations are welcome.
  • CRON scheduling is not required/wanted.

Answer

spiffytech picture spiffytech · Dec 29, 2011

Assuming you'll need to start multiple, possibly simultaneous, instances of the background task, the solution is a task queue. I've heard good things about Celery and RabbitMQ, if you're looking for 3rd-party options, and web2py includes it's own task queue system that might be sufficient for your needs.

With either tool, you'll define a function that encapsulates the operation you want the background process to perform. Then bring the task queue workers online. The web2py manual and forums indicate this can be done with an @reboot statement in the web2py cron system, which is triggered whenever the web server starts. There are probably other ways to start the workers if this is unsatisfactory.

In your controller you'll insert a task into the task queue, passing any necessary parameters as inputs to the function (the background function will not run in the same environment as the controller, so it won't have access to the session, DB, etc. unless you explicitly pass the appropriate values into the task function).

Now, to get the output of the background operation to the user. When you insert a task into the task queue, you should get back a unique ID for the task. You would then implement controller logic (either something that expects an AJAX call, or a page that keeps refreshing until the task completes) that calls the task queue's API to check the status of the specified task. If the task's status is "finished", return the data to the user. If not, keep waiting.