Correct Style for Python functions that mutate the argument

Neil Du Toit picture Neil Du Toit · Sep 25, 2014 · Viewed 11.2k times · Source

I would like to write a Python function that mutates one of the arguments (which is a list, ie, mutable). Something like this:

def change(array):
   array.append(4)

change(array)

I'm more familiar with passing by value than Python's setup (whatever you decide to call it). So I would usually write such a function like this:

def change(array):
  array.append(4)
  return array

array = change(array)

Here's my confusion. Since I can just mutate the argument, the second method would seem redundant. But the first one feels wrong. Also, my particular function will have several parameters, only one of which will change. The second method makes it clear what argument is changing (because it is assigned to the variable). The first method gives no indication. Is there a convention? Which is 'better'? Thank you.

Answer

abarnert picture abarnert · Sep 25, 2014

The convention in Python is that functions either mutate something, or return something, not both.

If both are useful, you conventionally write two separate functions, with the mutator named for an active verb like change, and the non-mutator named for a participle like changed.

Almost everything in builtins and the stdlib follows this pattern. The list.append method you're calling returns nothing. Same with list.sort—but sorted leaves its argument alone and instead returns a new sorted copy.

There are a handful of exceptions for some of the special methods (e.g., __iadd__ is supposed to mutate and then return self), and a few cases where there clearly has to be one thing getting mutating and a different thing getting returned (like list.pop), and for libraries that are attempting to use Python as a sort of domain-specific language where being consistent with the target domain's idioms is more important than being consistent with Python's idioms (e.g., some SQL query expression libraries). Like all conventions, this one is followed unless there's a good reason not to.


So, why was Python designed this way?

Well, for one thing, it makes certain errors obvious. If you expected a function to be non-mutating and return a value, it'll be pretty obvious that you were wrong, because you'll get an error like AttributeError: 'NoneType' object has no attribute 'foo'.

It also makes conceptual sense: a function that returns nothing must have side-effects, or why would anyone have written it?

But there's also the fact that each statement in Python mutates exactly one thing—almost always the leftmost object in the statement. In other languages, assignment is an expression, mutating functions return self, and you can chain up a whole bunch of mutations into a single line of code, and that makes it harder to see the state changes at a glance, reason about them in detail, or step through them in a debugger.

Of course all of this is a tradeoff—it makes some code more verbose in Python than it would be in, say, JavaScript—but it's a tradeoff that's deeply embedded in Python's design.