Local variables in nested functions

noio picture noio · Sep 14, 2012 · Viewed 11.2k times · Source

Okay, bear with me on this, I know it's going to look horribly convoluted, but please help me understand what's happening.

from functools import partial

class Cage(object):
    def __init__(self, animal):
        self.animal = animal

def gotimes(do_the_petting):
    do_the_petting()

def get_petters():
    for animal in ['cow', 'dog', 'cat']:
        cage = Cage(animal)

        def pet_function():
            print "Mary pets the " + cage.animal + "."

        yield (animal, partial(gotimes, pet_function))

funs = list(get_petters())

for name, f in funs:
    print name + ":", 
    f()

Gives:

cow: Mary pets the cat.
dog: Mary pets the cat.
cat: Mary pets the cat.

So basically, why am I not getting three different animals? Isn't the cage 'packaged' into the local scope of the nested function? If not, how does a call to the nested function look up the local variables?

I know that running into these kind of problems usually means one is 'doing it wrong', but I'd like to understand what happens.

Answer

Martijn Pieters picture Martijn Pieters · Sep 14, 2012

The nested function looks up variables from the parent scope when executed, not when defined.

The function body is compiled, and the 'free' variables (not defined in the function itself by assignment), are verified, then bound as closure cells to the function, with the code using an index to reference each cell. pet_function thus has one free variable (cage) which is then referenced via a closure cell, index 0. The closure itself points to the local variable cage in the get_petters function.

When you actually call the function, that closure is then used to look at the value of cage in the surrounding scope at the time you call the function. Here lies the problem. By the time you call your functions, the get_petters function is already done computing it's results. The cage local variable at some point during that execution was assigned each of the 'cow', 'dog', and 'cat' strings, but at the end of the function, cage contains that last value 'cat'. Thus, when you call each of the dynamically returned functions, you get the value 'cat' printed.

The work-around is to not rely on closures. You can use a partial function instead, create a new function scope, or bind the variable as a default value for a keyword parameter.

  • Partial function example, using functools.partial():

    from functools import partial
    
    def pet_function(cage=None):
        print "Mary pets the " + cage.animal + "."
    
    yield (animal, partial(gotimes, partial(pet_function, cage=cage)))
    
  • Creating a new scope example:

    def scoped_cage(cage=None):
        def pet_function():
            print "Mary pets the " + cage.animal + "."
        return pet_function
    
    yield (animal, partial(gotimes, scoped_cage(cage)))
    
  • Binding the variable as a default value for a keyword parameter:

    def pet_function(cage=cage):
        print "Mary pets the " + cage.animal + "."
    
    yield (animal, partial(gotimes, pet_function))
    

There is no need to define the scoped_cage function in the loop, compilation only takes place once, not on each iteration of the loop.