The following behaviour seems rather counterintuitive to me (Python 3.4):
>>> [(yield i) for i in range(3)]
<generator object <listcomp> at 0x0245C148>
>>> list([(yield i) for i in range(3)])
[0, 1, 2]
>>> list((yield i) for i in range(3))
[0, None, 1, None, 2, None]
The intermediate values of the last line are actually not always None
, they are whatever we send
into the generator, equivalent (I guess) to the following generator:
def f():
for i in range(3):
yield (yield i)
It strikes me as funny that those three lines work at all. The Reference says that yield
is only allowed in a function definition (though I may be reading it wrong and/or it may simply have been copied from the older version). The first two lines produce a SyntaxError
in Python 2.7, but the third line doesn't.
Also, it seems odd
Could someone provide more information?
Note: this was a bug in the CPython's handling of
yield
in comprehensions and generator expressions, fixed in Python 3.8, with a deprecation warning in Python 3.7. See the Python bug report and the What's New entries for Python 3.7 and Python 3.8.
Generator expressions, and set and dict comprehensions are compiled to (generator) function objects. In Python 3, list comprehensions get the same treatment; they are all, in essence, a new nested scope.
You can see this if you try to disassemble a generator expression:
>>> dis.dis(compile("(i for i in range(3))", '', 'exec'))
1 0 LOAD_CONST 0 (<code object <genexpr> at 0x10f7530c0, file "", line 1>)
3 LOAD_CONST 1 ('<genexpr>')
6 MAKE_FUNCTION 0
9 LOAD_NAME 0 (range)
12 LOAD_CONST 2 (3)
15 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
18 GET_ITER
19 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
22 POP_TOP
23 LOAD_CONST 3 (None)
26 RETURN_VALUE
>>> dis.dis(compile("(i for i in range(3))", '', 'exec').co_consts[0])
1 0 LOAD_FAST 0 (.0)
>> 3 FOR_ITER 11 (to 17)
6 STORE_FAST 1 (i)
9 LOAD_FAST 1 (i)
12 YIELD_VALUE
13 POP_TOP
14 JUMP_ABSOLUTE 3
>> 17 LOAD_CONST 0 (None)
20 RETURN_VALUE
The above shows that a generator expression is compiled to a code object, loaded as a function (MAKE_FUNCTION
creates the function object from the code object). The .co_consts[0]
reference lets us see the code object generated for the expression, and it uses YIELD_VALUE
just like a generator function would.
As such, the yield
expression works in that context, as the compiler sees these as functions-in-disguise.
This is a bug; yield
has no place in these expressions. The Python grammar before Python 3.7 allows it (which is why the code is compilable), but the yield
expression specification shows that using yield
here should not actually work:
The yield expression is only used when defining a generator function and thus can only be used in the body of a function definition.
This has been confirmed to be a bug in issue 10544. The resolution of the bug is that using yield
and yield from
will raise a SyntaxError
in Python 3.8; in Python 3.7 it raises a DeprecationWarning
to ensure code stops using this construct. You'll see the same warning in Python 2.7.15 and up if you use the -3
command line switch enabling Python 3 compatibility warnings.
The 3.7.0b1 warning looks like this; turning warnings into errors gives you a SyntaxError
exception, like you would in 3.8:
>>> [(yield i) for i in range(3)]
<stdin>:1: DeprecationWarning: 'yield' inside list comprehension
<generator object <listcomp> at 0x1092ec7c8>
>>> import warnings
>>> warnings.simplefilter('error')
>>> [(yield i) for i in range(3)]
File "<stdin>", line 1
SyntaxError: 'yield' inside list comprehension
The differences between how yield
in a list comprehension and yield
in a generator expression operate stem from the differences in how these two expressions are implemented. In Python 3 a list comprehension uses LIST_APPEND
calls to add the top of the stack to the list being built, while a generator expression instead yields that value. Adding in (yield <expr>)
just adds another YIELD_VALUE
opcode to either:
>>> dis.dis(compile("[(yield i) for i in range(3)]", '', 'exec').co_consts[0])
1 0 BUILD_LIST 0
3 LOAD_FAST 0 (.0)
>> 6 FOR_ITER 13 (to 22)
9 STORE_FAST 1 (i)
12 LOAD_FAST 1 (i)
15 YIELD_VALUE
16 LIST_APPEND 2
19 JUMP_ABSOLUTE 6
>> 22 RETURN_VALUE
>>> dis.dis(compile("((yield i) for i in range(3))", '', 'exec').co_consts[0])
1 0 LOAD_FAST 0 (.0)
>> 3 FOR_ITER 12 (to 18)
6 STORE_FAST 1 (i)
9 LOAD_FAST 1 (i)
12 YIELD_VALUE
13 YIELD_VALUE
14 POP_TOP
15 JUMP_ABSOLUTE 3
>> 18 LOAD_CONST 0 (None)
21 RETURN_VALUE
The YIELD_VALUE
opcode at bytecode indexes 15 and 12 respectively is extra, a cuckoo in the nest. So for the list-comprehension-turned-generator you have 1 yield producing the top of the stack each time (replacing the top of the stack with the yield
return value), and for the generator expression variant you yield the top of the stack (the integer) and then yield again, but now the stack contains the return value of the yield
and you get None
that second time.
For the list comprehension then, the intended list
object output is still returned, but Python 3 sees this as a generator so the return value is instead attached to the StopIteration
exception as the value
attribute:
>>> from itertools import islice
>>> listgen = [(yield i) for i in range(3)]
>>> list(islice(listgen, 3)) # avoid exhausting the generator
[0, 1, 2]
>>> try:
... next(listgen)
... except StopIteration as si:
... print(si.value)
...
[None, None, None]
Those None
objects are the return values from the yield
expressions.
And to reiterate this again; this same issue applies to dictionary and set comprehension in Python 2 and Python 3 as well; in Python 2 the yield
return values are still added to the intended dictionary or set object, and the return value is 'yielded' last instead of attached to the StopIteration
exception:
>>> list({(yield k): (yield v) for k, v in {'foo': 'bar', 'spam': 'eggs'}.items()})
['bar', 'foo', 'eggs', 'spam', {None: None}]
>>> list({(yield i) for i in range(3)})
[0, 1, 2, set([None])]