asyncio.ensure_future vs. BaseEventLoop.create_task vs. simple coroutine?

crusaderky picture crusaderky · Mar 31, 2016 · Viewed 56.6k times · Source

I've seen several basic Python 3.5 tutorials on asyncio doing the same operation in various flavours. In this code:

import asyncio  

async def doit(i):
    print("Start %d" % i)
    await asyncio.sleep(3)
    print("End %d" % i)
    return i

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    #futures = [asyncio.ensure_future(doit(i), loop=loop) for i in range(10)]
    #futures = [loop.create_task(doit(i)) for i in range(10)]
    futures = [doit(i) for i in range(10)]
    result = loop.run_until_complete(asyncio.gather(*futures))
    print(result)

All the three variants above that define the futures variable achieve the same result; the only difference I can see is that with the third variant the execution is out of order (which should not matter in most cases). Is there any other difference? Are there cases where I can't just use the simplest variant (plain list of coroutines)?

Answer

Mikhail Gerasimov picture Mikhail Gerasimov · Apr 5, 2016

Actual info:

Starting from Python 3.7 asyncio.create_task(coro) high-level function was added for this purpose.

You should use it instead other ways of creating tasks from coroutimes. However if you need to create task from arbitrary awaitable, you should use asyncio.ensure_future(obj).


Old info:

ensure_future vs create_task

ensure_future is a method to create Task from coroutine. It creates tasks in different ways based on argument (including using of create_task for coroutines and future-like objects).

create_task is an abstract method of AbstractEventLoop. Different event loops can implement this function different ways.

You should use ensure_future to create tasks. You'll need create_task only if you're going to implement your own event loop type.

Upd:

@bj0 pointed at Guido's answer on this topic:

The point of ensure_future() is if you have something that could either be a coroutine or a Future (the latter includes a Task because that's a subclass of Future), and you want to be able to call a method on it that is only defined on Future (probably about the only useful example being cancel()). When it is already a Future (or Task) this does nothing; when it is a coroutine it wraps it in a Task.

If you know that you have a coroutine and you want it to be scheduled, the correct API to use is create_task(). The only time when you should be calling ensure_future() is when you are providing an API (like most of asyncio's own APIs) that accepts either a coroutine or a Future and you need to do something to it that requires you to have a Future.

and later:

In the end I still believe that ensure_future() is an appropriately obscure name for a rarely-needed piece of functionality. When creating a task from a coroutine you should use the appropriately-named loop.create_task(). Maybe there should be an alias for that asyncio.create_task()?

It's surprising to me. My main motivation to use ensure_future all along was that it's higher-level function comparing to loop's member create_task (discussion contains some ideas like adding asyncio.spawn or asyncio.create_task).

I can also point that in my opinion it's pretty convenient to use universal function that can handle any Awaitable rather than coroutines only.

However, Guido's answer is clear: "When creating a task from a coroutine you should use the appropriately-named loop.create_task()"

When coroutines should be wrapped in tasks?

Wrap coroutine in a Task - is a way to start this coroutine "in background". Here's example:

import asyncio


async def msg(text):
    await asyncio.sleep(0.1)
    print(text)


async def long_operation():
    print('long_operation started')
    await asyncio.sleep(3)
    print('long_operation finished')


async def main():
    await msg('first')

    # Now you want to start long_operation, but you don't want to wait it finised:
    # long_operation should be started, but second msg should be printed immediately.
    # Create task to do so:
    task = asyncio.ensure_future(long_operation())

    await msg('second')

    # Now, when you want, you can await task finised:
    await task


if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

Output:

first
long_operation started
second
long_operation finished

You can replace asyncio.ensure_future(long_operation()) with just await long_operation() to feel the difference.