What exactly is contained within an obj.__closure__?

user621819 · Jan 19, 2013 · Viewed 9.7k times

Beazley pg 100 mentions:

>>> python.__closure__
(<cell at 0x67f50: str object at 0x69230>,)
>>> python.__closure__[0].cell_contents

My understanding is that __closure__ is a list, but what is all this cell stuff and the str object? That looks like a 1-ary tuple?

Answer

Martijn Pieters · Jan 19, 2013

Closure cells refer to values that the function needs but that are taken from the surrounding scope.

When Python compiles a nested function, it notes any variables that the function references but that are defined only in a parent function (not globals). It records these names in the code objects of both functions: in the co_freevars attribute of the nested function's __code__ object, and in the co_cellvars attribute of the parent function's __code__ object.
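
As a minimal sketch of where those names end up, the snippet below inspects both code objects. The function names outer and inner are hypothetical and chosen only for illustration:

```python
def outer():
    # 'spam' is referenced by inner, so the compiler makes it a cell variable of outer
    spam = 'ham'

    def inner():
        # 'spam' is not local to inner and not a global: it is a free variable
        return spam

    return inner

inner = outer()

print(outer.__code__.co_cellvars)  # ('spam',)
print(inner.__code__.co_freevars)  # ('spam',)
```

Both attributes are filled in at compile time, before either function has run.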

Then, when the nested function is actually created (which happens each time the parent function is executed), those references are used to attach a closure to it.
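
One way to see that the closure is attached at creation time rather than at compile time is to call the parent twice. In this sketch (make_reader and read are hypothetical names), both results share one compiled code object but each has its own cell:

```python
def make_reader(value):
    def read():
        return value  # free variable, resolved through a closure cell
    return read

first = make_reader(1)
second = make_reader(2)

# The nested function is compiled once, so the code object is shared...
print(first.__code__ is second.__code__)              # True
# ...but each call to make_reader creates a new function with its own cell.
print(first.__closure__[0] is second.__closure__[0])  # False
print(first(), second())                              # 1 2
```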

A function's closure holds a tuple of cells, one for each free variable (named in co_freevars). Cells are special references to local variables of a parent scope, and they follow the values those local variables point to. This is best illustrated with an example:

def foo():
    def bar():
        print(spam)

    spam = 'ham'
    bar()
    spam = 'eggs'
    bar()
    return bar

b = foo()
b()

In the above example, the function bar has one closure cell, which points to spam in the function foo. The cell follows the value of spam. More importantly, once foo() completes and bar is returned, the cell continues to reference the value (the string 'eggs') even though the variable spam inside foo no longer exists.

Thus, the above code outputs:

>>> b=foo()
ham
eggs
>>> b()
eggs

and b.__closure__[0].cell_contents is 'eggs'.
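
To tie this back to the question, here is a small sketch that inspects the closure directly. It uses a quiet variant of foo that returns the value instead of printing it, which is an adaptation for illustration and not part of the original example:

```python
def foo():
    def bar():
        return spam

    spam = 'ham'
    spam = 'eggs'  # the cell follows the rebinding
    return bar

b = foo()

# __closure__ is a tuple (not a list) holding one cell per free variable
print(type(b.__closure__).__name__)  # tuple
print(len(b.__closure__))            # 1

# Pair each free variable name with the current contents of its cell
contents = dict(zip(b.__code__.co_freevars,
                    (cell.cell_contents for cell in b.__closure__)))
print(contents)                      # {'spam': 'eggs'}

# A function without free variables has no closure at all
print((lambda: 1).__closure__)       # None
```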

Note that the cell is dereferenced each time bar() is called; the closure captures the variable, not the value it had when bar was defined. That makes a difference when you create nested functions (with lambda expressions or def statements) that reference a loop variable:

def foo():
    bar = []
    for spam in ('ham', 'eggs', 'salad'):
        bar.append(lambda: spam)
    return bar

for bar in foo():
    print(bar())

The above prints salad three times in a row, because all three lambda functions reference the spam variable, not the value it was bound to when each function object was created. By the time the for loop finishes, spam is bound to 'salad', so all three closures resolve to that value.
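
The shared variable can be observed directly: in the sketch below, the lambdas from the loop all hold the very same cell object. For contrast, it also shows a commonly used idiom that is not part of the answer above, binding the current value as a default argument, which avoids the closure altogether. The names late_binding and early_binding are hypothetical:

```python
def late_binding():
    funcs = []
    for spam in ('ham', 'eggs', 'salad'):
        funcs.append(lambda: spam)  # closes over the variable spam
    return funcs

def early_binding():
    funcs = []
    for spam in ('ham', 'eggs', 'salad'):
        # The default value is evaluated when the lambda is created,
        # so each function keeps its own value and needs no closure.
        funcs.append(lambda spam=spam: spam)
    return funcs

late = late_binding()
early = early_binding()

print([f() for f in late])                        # ['salad', 'salad', 'salad']
print(late[0].__closure__[0] is late[1].__closure__[0])  # True: one shared cell
print([f() for f in early])                       # ['ham', 'eggs', 'salad']
print(early[0].__closure__)                       # None
```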