Python name mangling

Paul Manta · Sep 17, 2011

In other languages, a general guideline that helps produce better code is to always make everything as hidden as possible. If in doubt about whether a variable should be private or protected, it's better to go with private.

Does the same hold true for Python? Should I use two leading underscores on everything at first, and only make them less hidden (only one underscore) as I need them?

If the convention is to use only one underscore, I'd also like to know the rationale.

Here's a comment I left on JBernardo's answer. It explains why I asked this question and also why I'd like to know why Python is different from the other languages:

I come from languages that train you to think everything should be only as public as needed and no more. The reasoning is that this will reduce dependencies and make the code safer to alter. The Python way of doing things in reverse -- starting from public and going towards hidden -- is odd to me.

Answer

brandizzi · Sep 17, 2011

When in doubt, leave it "public" - I mean, do not add anything to obscure the name of your attribute. If you have a class with some internal value, do not worry about hiding it. Instead of writing:

class Stack(object):

    def __init__(self):
        self.__storage = [] # Too uptight

    def push(self, value):
        self.__storage.append(value)

write this by default:

class Stack(object):

    def __init__(self):
        self.storage = [] # No mangling

    def push(self, value):
        self.storage.append(value)

This is certainly a controversial way of doing things. Python newbies hate it, and even some experienced Python programmers despise this default - but it is the default anyway, so I really recommend that you follow it, even if you feel uncomfortable.

If you really want to send the message "Can't touch this!" to your users, the usual way is to precede the variable name with one underscore. This is just a convention, but people understand it and take extra care when dealing with such names:

class Stack(object):

    def __init__(self):
        self._storage = []  # This is OK, though Pythonistas tend to be relaxed about it

    def push(self, value):
        self._storage.append(value)

This can be useful, too, for avoiding conflict between property names and attribute names:

class Person(object):
    def __init__(self, name, age):
        self.name = name
        self._age = age if age >= 0 else 0

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, age):
        if age >= 0:
            self._age = age
        else:
            self._age = 0
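
To see why the underscore matters here, consider a sketch of what happens without it. BrokenPerson is a hypothetical class of mine, not something from the question; it stores the value under the same name as the property, so every access goes back through the property itself:

```python
class BrokenPerson(object):
    """Hypothetical class: the property and the stored attribute share one name."""

    def __init__(self, age):
        self.age = age  # invokes the setter below

    @property
    def age(self):
        return self.age  # looks up the property again, not a stored value

    @age.setter
    def age(self, age):
        self.age = age  # invokes this same setter again, without end


recursed = False
try:
    BrokenPerson(30)
except RuntimeError:  # RecursionError is a subclass of RuntimeError
    recursed = True

print(recursed)
```

Storing the value as _age, as in the Person class above, gives the property a separate name to read from and write to.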

What about the double underscore? Well, the double underscore magic (name mangling) is used mainly to avoid accidental overriding of methods and name conflicts with superclasses' attributes. It can be quite useful if you write a class that is expected to be extended many times.

If you want to use it for other purposes, you can, but it is neither usual nor recommended.
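
A minimal sketch of that use case, assuming two hypothetical classes Base and Child that both happen to pick the attribute name __token. Inside a class body, Python rewrites __token to _ClassName__token, so the two attributes do not collide:

```python
class Base(object):
    def __init__(self):
        # Stored on the instance as _Base__token because of name mangling.
        self.__token = "base"

    def base_token(self):
        return self.__token


class Child(Base):
    def __init__(self):
        Base.__init__(self)
        # Stored as _Child__token, so it does not overwrite Base's attribute.
        self.__token = "child"

    def child_token(self):
        return self.__token


child = Child()
print(child.base_token())   # base
print(child.child_token())  # child
print(sorted(vars(child)))  # ['_Base__token', '_Child__token']
```

Had both classes used _token with a single underscore, the assignment in Child would have silently replaced the value Base relies on.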

EDIT: Why is this so? Well, the usual Python style does not emphasize making things private - on the contrary! There are a lot of reasons for that - most of them controversial... Let us see some of them.

Python has properties

Most OO languages today use the opposite approach: what should not be used should not be visible, so attributes should be private. Theoretically, this would yield more manageable, less coupled classes, because no one would change values inside the objects recklessly.

However, it is not so simple. For example, Java classes have a lot of attributes with getters that just get the values and setters that just set the values. You need, let us say, seven lines of code to declare a single attribute - which a Python programmer would say is needlessly complex. Also, in practice, you write all this code only to end up with what is effectively a public field, since anyone can change its value through the getters and setters.

So why follow this private-by-default policy? Just make your attributes public by default. Of course, this is problematic in Java, because if you decide to add some validation to your attribute, it would require you to change all

person.age = age;

in your code to, let us say,

person.setAge(age);

setAge() being:

public void setAge(int age) {
    if (age >= 0) {
        this.age = age;
    } else {
        this.age = 0;
    }
}

So in Java (and other languages), the default is to use getters and setters anyway: they are annoying to write, but they can spare you a lot of time if you find yourself in the situation I've described.

However, you do not need to do it in Python, since Python has properties. If you have this class:

class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age

and then you decide to validate ages, you do not need to change the person.age = age pieces of your code. Just add a property (as shown below):

class Person(object):
    def __init__(self, name, age):
        self.name = name
        self._age = age if age >= 0 else 0

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, age):
        if age >= 0:
            self._age = age
        else:
            self._age = 0

If you can do it and still use person.age = age, why would you add private fields and getters and setters?
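
As a quick check of that claim, here is a sketch that repeats the Person class with the property and then runs the same kind of caller code that worked with the plain attribute. The name "Ann" and the ages are just illustrative values:

```python
class Person(object):
    def __init__(self, name, age):
        self.name = name
        self._age = age if age >= 0 else 0

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, age):
        # Same validation as before: negative ages are clamped to 0.
        self._age = age if age >= 0 else 0


person = Person("Ann", 30)

# Caller code is written exactly as it was for a plain public attribute.
person.age = 31
print(person.age)  # 31

person.age = -5    # validation now runs behind the assignment
print(person.age)  # 0
```

The assignments look the same as before the property existed, which is why there is little reason to start with getters and setters "just in case".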

(Also, see Python is not Java and this article about the harms of using getters and setters.)

Everything is visible anyway - and trying to hide just complicates your work

Even in languages where there are private attributes, you can access them through some kind of reflection/introspection library. And people do it a lot, in frameworks and for solving urgent needs. The problem is that introspection libraries are just a hard way of doing what you could do with public attributes.

Since Python is a very dynamic language, it is just counterproductive to add this burden to your classes.
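
For instance, even a double-underscore attribute is only renamed, not locked away. A small sketch, reusing the Stack class from the top of this answer (the variable plain_name_works is just a helper for the demonstration):

```python
class Stack(object):
    def __init__(self):
        self.__storage = []  # mangled to _Stack__storage

    def push(self, value):
        self.__storage.append(value)


stack = Stack()
stack.push(1)

# Outside the class body the name is not mangled, so this lookup fails.
try:
    stack.__storage
except AttributeError:
    plain_name_works = False
else:
    plain_name_works = True

print(plain_name_works)       # False
print(stack._Stack__storage)  # [1]
```

Anyone who wants the list can still reach it through the mangled name, so the double underscore adds friction without adding real protection.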

The problem is not being able to see - it is being required to see

For a Pythonista, encapsulation is not the inability to see the internals of classes, but the possibility of avoiding looking at them. What I mean is, encapsulation is the property of a component which allows it to be used without the user being concerned about the internal details. If you can use a component without bothering yourself about its implementation, then it is encapsulated (in the opinion of a Python programmer).

Now, if you wrote your class in such a way you can use it without having to think about implementation details, there is no problem if you want to look inside the class for some reason. The point is: your API should be good and the rest is details.

Guido said so

Well, this is not controversial: he said so, actually. (Look for "open kimono.")

This is culture

Yes, there are some reasons, but no critical reason. This is mostly a cultural aspect of programming in Python. Frankly, it could be the other way, too - but it is not. Also, you could just as easily ask the other way around: why do some languages use private attributes by default? For the same main reason as for the Python practice: because it is the culture of these languages, and each choice has advantages and disadvantages.

Since this culture already exists, you are well advised to follow it. Otherwise, you will get annoyed by Python programmers telling you to remove the __ from your code when you ask a question on Stack Overflow :)