Understanding the difference between `self` and `cls` and whether they refer to the same attributes

ymmx · Feb 20, 2018 · Viewed 11.4k times

I'm trying to understand if there are differences between self and cls but I'm struggling, even though a lot of discussion on this topic exists. For instance:

class maclass():
    A = "class method"

    def __init__(self):
        self.B = "instance method"

    def getA_s(self):
        print(self.A)

    def getA_c(cls):
        print(cls.A)

    def getB_s(self):
        print(self.B)

    def getB_c(cls):
        print(cls.B)

C = maclass()
C.getA_s()
C.getA_c()
C.getB_s()
C.getB_c()

which gives me:

class method
class method
instance method
instance method

So whether I use self or cls, it always refers to the same variable. When I add a self.A in __init__, cls.A is just replaced:

    def __init__(self):
        self.B = "instance method"
        self.A = "new instance method"

and I get:

new instance method
new instance method
instance method
instance method

I don't understand the point of having two ways to refer to a class member if they are the same. I know this is a common question on this forum, yet I really don't understand why we would use different words to refer to the same thing (we could even use any variable name instead of self or cls).

Update

In the following case:

class maclass():
    A = "class method, "

    def __init__(self):
        self.A = "instance method, "

    def getA_s(self):
        print(self.A)  # gives me "instance method, "

    @classmethod
    def getA_c(cls):
        print(cls.A)  # gives me "class method, "

C = maclass()
C.getA_s()
C.getA_c()
print(' ')
print(C.A)  # gives me "instance method, "

I get:

instance method, 
class method, 

instance method,    

So in this case, in maclass: cls.A and self.A do not refer to the same variable.

Answer

Martijn Pieters · Feb 20, 2018

All your methods are instance methods. None of them are class methods.

The first argument to a method is named self only by convention. You can name it anything you want, and naming it cls instead will not make it a reference to the class. That the first argument is bound to an instance is due to how method lookup works: accessing C.getA_s produces a bound method object, and calling that object causes C to be passed into the original function getA_s. The names of the parameters play no role.
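To illustrate, here is a minimal sketch. The names Demo and who_am_i are made up for this example and are not from your code. The method has no decorator, so it receives the instance even though its first parameter is called cls:

```python
class Demo:
    # No @classmethod here, so this is a plain instance method,
    # even though the first parameter is called "cls".
    def who_am_i(cls):
        return cls


obj = Demo()

# The instance is passed in, whatever the parameter is called.
print(obj.who_am_i() is obj)    # True
print(obj.who_am_i() is Demo)   # False

# Calling through the bound method is equivalent to calling the
# plain function on the class and passing the instance explicitly.
print(Demo.who_am_i(obj) is obj)  # True
```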

In your methods, you are merely referencing instance attributes. That the A attribute is ultimately only defined on the class doesn't matter; you are still accessing that attribute through C.A (where C is the instance you created), not maclass.A. Looking up an attribute on the instance will also find attributes defined on the class if there is no instance attribute shadowing it.
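The lookup order can be seen in a small sketch. The class name Shadowing and the attribute name shared are hypothetical, chosen only for this illustration; vars() shows what is actually stored on the instance:

```python
class Shadowing:
    shared = "set on the class"


obj = Shadowing()

# Nothing is stored on the instance yet, so lookup falls back to the class.
print("shared" in vars(obj))   # False
print(obj.shared)              # set on the class

# Assigning through the instance creates an instance attribute that
# shadows the class attribute; the class attribute itself is unchanged.
obj.shared = "set on the instance"
print(vars(obj))               # {'shared': 'set on the instance'}
print(obj.shared)              # set on the instance
print(Shadowing.shared)        # set on the class

# Removing the instance attribute makes the class attribute visible again.
del obj.shared
print(obj.shared)              # set on the class
```

This is what happened when you added self.A in __init__: the class attribute was not replaced, it was only hidden when looked up through the instance.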

To make a method a class method, decorate it with the @classmethod decorator:

@classmethod
def getA_c(cls):
    print(cls.A)

Now cls will always be a reference to the class, never to the instance. I need to stress again that it doesn't actually matter to Python what name I picked for that first argument, but cls is the convention here as that makes it easier to remind the reader that this method is bound to the class object.

Note that if you do this for the getB_c() method, then trying to access cls.B in the method will fail because there is no B attribute on the maclass class object.

cls is always the class because classmethod wraps the function in a descriptor object that overrides the normal function binding behaviour. It is the descriptor protocol that causes methods to be bound to instances when accessed as attributes on the instance; a classmethod object redirects that binding process so that the method is bound to the class instead.
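If you want to see the descriptor protocol at work, the following sketch calls __get__ by hand, which is roughly what attribute access does for you. The class Demo and its method names are assumptions made up for this example:

```python
class Demo:
    def instance_method(self):
        return self

    @classmethod
    def class_method(cls):
        return cls


obj = Demo()

# vars(Demo) holds the raw objects stored in the class body,
# before any binding has taken place.
raw_function = vars(Demo)["instance_method"]
raw_classmethod = vars(Demo)["class_method"]
print(type(raw_function).__name__)     # function
print(type(raw_classmethod).__name__)  # classmethod

# Attribute access calls __get__ on those objects. A function binds
# to the instance, a classmethod object binds to the class instead.
bound_to_instance = raw_function.__get__(obj, Demo)
bound_to_class = raw_classmethod.__get__(obj, Demo)
print(bound_to_instance() is obj)   # True
print(bound_to_class() is Demo)     # True
```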

Here is a short demonstration with inline comments. I used the Python conventions for naming classes (CamelCase) and for instances, attributes, functions and methods (snake_case):

>>> class MyClass():
...     class_attribute = "String attribute on the class"
...     def __init__(self):
...         self.instance_attribute = "String attribute on the instance"
...     @classmethod
...     def get_class_attribute(cls):
...         return cls.class_attribute
...     def get_instance_attribute(self):
...         return self.instance_attribute
...     @classmethod
...     def get_instance_attribute_on_class(cls):
...         return cls.instance_attribute
...
>>> instance = MyClass()
>>> instance.class_attribute  # class attributes are visible on the instance
'String attribute on the class'
>>> MyClass.class_attribute   # class attributes are also visible on the class
'String attribute on the class'
>>> instance.get_class_attribute()  # bound to the class, but that doesn't matter here
'String attribute on the class'
>>> instance.class_attribute = "String attribute value overriding the class attribute"
>>> instance.get_class_attribute()  # bound to the class, so the class attribute is found
'String attribute on the class'
>>> MyClass.get_instance_attribute_on_class()   # fails, there is no instance_attribute on the class
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 12, in get_instance_attribute_on_class
AttributeError: type object 'MyClass' has no attribute 'instance_attribute'

Note that the class method accesses the class attribute even though we set an attribute with the same name on the instance.

Next is binding behaviour:

>>> MyClass.get_instance_attribute   # accessing the method on the class gives you the function
<function MyClass.get_instance_attribute at 0x10f94f268>
>>> instance.get_instance_attribute  # accessing the method on the instance gives you the bound method
<bound method MyClass.get_instance_attribute of <__main__.MyClass object at 0x10f92b5f8>>
>>> MyClass.get_class_attribute      # class methods are always bound, to the class
<bound method MyClass.get_class_attribute of <class '__main__.MyClass'>>
>>> instance.get_class_attribute     # class methods are always bound, to the class
<bound method MyClass.get_class_attribute of <class '__main__.MyClass'>>

The bound methods tell you what they are bound to; calling the method passes in that bound object as the first argument. That object can also be introspected by looking at the __self__ attribute of a bound method:

>>> instance.get_instance_attribute.__self__  # the instance
<__main__.MyClass object at 0x10f92b5f8>
>>> instance.get_class_attribute.__self__     # the class
<class '__main__.MyClass'>
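
As a further sketch, a bound method also exposes the underlying function as __func__, so calling the bound method amounts to calling that function with __self__ as the first argument. The class Demo below is hypothetical and separate from MyClass above, and the results assume Python 3:

```python
class Demo:
    def instance_method(self):
        return self

    @classmethod
    def class_method(cls):
        return cls


obj = Demo()

bound_instance = obj.instance_method
bound_class = obj.class_method

# __self__ is the object the method is bound to.
print(bound_instance.__self__ is obj)   # True
print(bound_class.__self__ is Demo)     # True

# Calling the bound method gives the same result as calling the
# underlying function with __self__ passed in explicitly.
print(bound_instance.__func__(bound_instance.__self__) is bound_instance())  # True
print(bound_class.__func__(bound_class.__self__) is bound_class())           # True
```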