Subclassing collections namedtuple

BoltzmannBrain · Jun 2, 2017

Python's namedtuple can be really useful as a lightweight, immutable data class. I like using them for bookkeeping parameters rather than dictionaries. When some more functionality is desired, such as a simple docstring or default values, you can easily refactor the namedtuple to a class. However, I've seen classes that inherit from namedtuple. What functionality are they gaining, and what performance are they losing? For example, I would implement this as

from collections import namedtuple

class Pokemon(namedtuple('Pokemon', 'name type level')):
    """
    Attributes
    ----------
    name : str
        What do you call your Pokemon?
    type : str
        grass, rock, electric, etc.
    level : int
        Experience level [0, 100]
    """
    __slots__ = ()

The subclass exists for the sole purpose of documenting the attributes cleanly, and __slots__ = () prevents the creation of a per-instance __dict__ (keeping the lightweight nature of namedtuples).
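To make the effect of __slots__ concrete, here is a minimal sketch, assuming Python 3. The names BasePokemon, WithDict, WithoutDict, and nickname are made up for illustration and are not part of the question's code.

```python
from collections import namedtuple

# Hypothetical base type, mirroring the fields used in the question.
BasePokemon = namedtuple('BasePokemon', 'name type level')


class WithDict(BasePokemon):
    """Subclass without __slots__: each instance gets its own __dict__."""


class WithoutDict(BasePokemon):
    """Subclass with empty __slots__: no per-instance __dict__ is created."""
    __slots__ = ()


heavy = WithDict('Bulbasaur', 'grass', 5)
light = WithoutDict('Bulbasaur', 'grass', 5)

# Arbitrary attributes can be attached only when a __dict__ exists.
heavy.nickname = 'Bulby'

try:
    light.nickname = 'Bulby'
    slots_blocked = False
except AttributeError:
    # With __slots__ = () there is nowhere to store the new attribute.
    slots_blocked = True
```

Both objects still compare equal as tuples; the only difference is whether extra per-instance storage is allocated.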

Is there a better recommendation of a lightweight data class for documenting parameters? Note I'm using Python 2.7.

Answer

Rick supports Monica · Jun 2, 2017

NEW UPDATE:

In Python 3.6+, you can use the new typed syntax and create a typing.NamedTuple. The new syntax supports most of the usual Python class creation features (docstrings, default arguments, and methods are available as of 3.6.1). Multiple inheritance is the exception: listing a mixin next to typing.NamedTuple is not supported, and recent Python versions raise TypeError for it.

import typing

class Pokemon(typing.NamedTuple):
    """
    Attributes
    ----------
    name : str
        What do you call your Pokemon?
    type : str
        grass, rock, electric, etc.
    level : int
        Experience level [0, 100]
    """
    name: str
    type: str
    level: int = 0 # 3.6.1 required for default args

    def method(self):
        # method work goes here
        pass

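The classes created this way are mostly equivalent to those from collections.namedtuple: instances are still plain tuples with the same _fields, _replace, and _asdict helpers. The main difference is that the type annotations are also stored on the class.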

You can also use a functional syntax similar to that of collections.namedtuple:

Pokemon = typing.NamedTuple('Pokemon', [('name', str), ('type', str), ('level', int)])
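As a quick illustration of how such a class behaves in practice, here is a small sketch, assuming Python 3.6.1 or later. The sample values are invented; the helper methods shown (_replace, _asdict, _fields) are the standard named tuple ones.

```python
import typing


class Pokemon(typing.NamedTuple):
    name: str
    type: str
    level: int = 0  # default value, used when the argument is omitted


pikachu = Pokemon('Pikachu', 'electric')

# Instances are immutable, so "changing" a field builds a new tuple.
evolved = pikachu._replace(level=30)

# They still unpack and convert like any other named tuple.
name, type_, level = pikachu
as_mapping = pikachu._asdict()
```

Note that the annotations are not enforced at runtime; they only document the intended types and feed static type checkers.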

Original Answer


Short answer: no, subclassing is not needed just for documentation, unless you are using Python < 3.5.

The Python 3 docs seem to imply pretty clearly that unless you need to add calculated fields (i.e., descriptors), subclassing namedtuple is not considered the canonical approach. This is because you can update the docstrings directly (they are writable as of 3.5!).

Subclassing is not useful for adding new, stored fields. Instead, simply create a new named tuple type from the _fields attribute...

Docstrings can be customized by making direct assignments to the __doc__ fields...
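A minimal sketch of both suggestions from the docs, assuming Python 3.5 or later. The TrainedPokemon type and its trainer field are hypothetical names added here only to show the _fields approach.

```python
from collections import namedtuple

Pokemon = namedtuple('Pokemon', 'name type level')

# Docstrings of the class and of each field are writable in Python 3.5+.
Pokemon.__doc__ = 'Lightweight, immutable record describing a Pokemon.'
Pokemon.name.__doc__ = 'What do you call your Pokemon?'
Pokemon.type.__doc__ = 'grass, rock, electric, etc.'
Pokemon.level.__doc__ = 'Experience level [0, 100]'

# To add a stored field, build a new type from _fields instead of subclassing.
# "trainer" is an illustrative extra field, not part of the original example.
TrainedPokemon = namedtuple('TrainedPokemon', Pokemon._fields + ('trainer',))

ash_pikachu = TrainedPokemon('Pikachu', 'electric', 25, 'Ash')
```

With this approach help(Pokemon) shows the custom text without any subclass being defined.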

UPDATE:

There are now a couple of other compelling possibilities for lightweight data classes in the latest versions of Python.

One is types.SimpleNamespace (Python 3.3 and later). It is not structured like namedtuple, but structure isn't always necessary.

One thing to note about SimpleNamespace: by default, the field names must be given explicitly as keyword arguments when instantiating the class. This is fairly easy to work around, though, by defining an __init__ that calls super().__init__:

from types import SimpleNamespace

class Pokemon(SimpleNamespace):
    """
    Attributes
    ----------
    name : str
        What do you call your Pokemon?
    type : str
        grass, rock, electric, etc.
    level : int
        Experience level [0, 100]
    """
    # no __slots__ here: SimpleNamespace keeps its attributes in the instance __dict__
    # note that use of __init__ is optional
    def __init__(self, name, type, level):
        super().__init__(name=name, type=type, level=level)

Another intriguing option, available as of Python 3.7, is dataclasses.dataclass (see also PEP 557):

from dataclasses import dataclass

@dataclass
class Pokemon:
    # __slots__ is omitted: listing "level" there while also giving it a default raises ValueError
    name: str  # What do you call your Pokemon?
    type: str  # grass, rock, electric, etc.
    level: int = 0  # Experience level [0, 100]

Note that both of these suggestions are mutable by default, and that __slots__ is not required for either one.
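If immutability matters, as it does with namedtuple, a dataclass can be frozen. The following is a small sketch, assuming Python 3.7 or later; the class name FrozenPokemon and the sample values are made up for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError
from types import SimpleNamespace


@dataclass(frozen=True)
class FrozenPokemon:
    name: str
    type: str
    level: int = 0


# SimpleNamespace instances can be modified freely.
ns = SimpleNamespace(name='Pikachu', type='electric', level=25)
ns.level = 26

# A frozen dataclass rejects attribute assignment after construction.
frozen = FrozenPokemon('Pikachu', 'electric', 25)
try:
    frozen.level = 26
    mutated = True
except FrozenInstanceError:
    mutated = False
```

A frozen dataclass is also hashable by default (as long as its fields are), which brings it closer to namedtuple for use as a dictionary key.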