Understanding __init_subclass__

EsotericVoid picture EsotericVoid · Jul 30, 2017 · Viewed 22.5k times · Source

I finally upgraded my python version and I was discovering the new features added. Among other things, I was scratching my head around the new __init_subclass__ method. From the docs:

This method is called whenever the containing class is subclassed. cls is then the new subclass. If defined as a normal instance method, this method is implicitly converted to a class method.

So I started to playing around with it a little bit, following the example in the docs:

class Philosopher:
    def __init_subclass__(cls, default_name, **kwargs):
        super().__init_subclass__(**kwargs)
        print(f"Called __init_subclass({cls}, {default_name})")
        cls.default_name = default_name

class AustralianPhilosopher(Philosopher, default_name="Bruce"):
    pass

class GermanPhilosopher(Philosopher, default_name="Nietzsche"):
    default_name = "Hegel"
    print("Set name to Hegel")

Bruce = AustralianPhilosopher()
Mistery = GermanPhilosopher()
print(Bruce.default_name)
print(Mistery.default_name)

Produces this output:

Called __init_subclass(<class '__main__.AustralianPhilosopher'>, 'Bruce')
'Set name to Hegel'
Called __init_subclass(<class '__main__.GermanPhilosopher'>, 'Nietzsche')
'Bruce'
'Nietzsche'

I understand that this method is called after the subclass definition, but my questions are particularly about the usage of this feature. I read the PEP 487 article as well, but didn't help me much. Where would this method be helpful? Is it for:

  • the superclass to register the subclasses upon creation?
  • forcing the subclass to set a field at definition time?

Also, do I need to understand the __set_name__ to fully comprehend its usage?

Answer

Martijn Pieters picture Martijn Pieters · Jul 30, 2017

PEP 487 sets out to take two common metaclass usecases and make them more accessible without having to understand all the ins and outs of metaclasses. The two new features, __init_subclass__ and __set_name__ are otherwise independent, they don't rely on one another.

__init_subclass__ is just a hook method. You can use it for anything you want. It is useful for both registering subclasses in some way, and for setting default attribute values on those subclasses.

We recently used this to provide 'adapters' for different version control systems, for example:

class RepositoryType(Enum):
    HG = auto()
    GIT = auto()
    SVN = auto()
    PERFORCE = auto()

class Repository():
    _registry = {t: {} for t in RepositoryType}

    def __init_subclass__(cls, scm_type=None, name=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if scm_type is not None:
            cls._registry[scm_type][name] = cls

class MainHgRepository(Repository, scm_type=RepositoryType.HG, name='main'):
    pass

class GenericGitRepository(Repository, scm_type=RepositoryType.GIT):
    pass

This trivially let us define handler classes for specific repositories without having to resort to using a metaclass or decorators.