Creating an Ordered Counter

Sean · Feb 17, 2016 · Viewed 10.9k times

I've been reading into how super() works. I came across this recipe that demonstrates how to create an Ordered Counter:

from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first seen'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__,
                           OrderedDict(self))
    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)

For example:

oc = OrderedCounter('adddddbracadabra')

print(oc)

OrderedCounter(OrderedDict([('a', 5), ('d', 6), ('b', 2), ('r', 2), ('c', 1)]))

Is someone able to explain how this magically works?

This also appears in the Python documentation.

Answer

RootTwo · Feb 17, 2016

OrderedCounter is given as an example in the OrderedDict documentation, and works without needing to override any methods:

class OrderedCounter(Counter, OrderedDict):
    pass

When a method is called on an instance, Python has to find the correct implementation to execute. It searches the class hierarchy in a defined order called the "method resolution order", or mro. The mro is stored in the class attribute __mro__:

OrderedCounter.__mro__

(<class '__main__.OrderedCounter'>, <class 'collections.Counter'>, <class 'collections.OrderedDict'>, <class 'dict'>, <class 'object'>)

When __setitem__() is called on an instance of OrderedCounter, Python searches the classes in order: OrderedCounter, Counter, OrderedDict. Neither OrderedCounter nor Counter defines __setitem__(), so it is found in OrderedDict. A statement like oc['a'] = 0 therefore ends up calling OrderedDict.__setitem__().

In contrast, __getitem__ is not overridden by any of the classes ahead of dict in the mro, so count = oc['a'] is handled by dict.__getitem__().

oc = OrderedCounter()    
oc['a'] = 1             # this call uses OrderedDict.__setitem__
count = oc['a']         # this call uses dict.__getitem__
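The lookups described above can be checked directly. The following is a minimal sketch, assuming CPython 3; defining_class is a hypothetical helper written for this illustration, not something from the standard library. It walks __mro__ and reports the first class whose own namespace defines a given name:

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    pass


def defining_class(cls, name):
    """Return the first class in cls.__mro__ whose own namespace defines name.

    Hypothetical helper for illustration; not part of the standard library.
    """
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass
    raise AttributeError(name)


# Counter does not define __setitem__, so the search continues to OrderedDict.
print(defining_class(OrderedCounter, '__setitem__').__name__)   # OrderedDict
# Neither Counter nor OrderedDict defines __getitem__, so dict supplies it.
print(defining_class(OrderedCounter, '__getitem__').__name__)   # dict
# update is found on Counter, the first base class.
print(defining_class(OrderedCounter, 'update').__name__)        # Counter
```

This only inspects class namespaces, which is a simplification of full attribute lookup, but it is enough to see which class wins for ordinary methods.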

A more interesting call sequence occurs for a statement like oc.update('foobar'). First, Counter.update() gets called. The code for Counter.update() assigns to self[elem], which gets turned into a call to OrderedDict.__setitem__(). That method, in turn, calls dict.__setitem__().
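One way to observe this call chain is to slot a logging class into the hierarchy. This is a rough sketch, assuming Python 3; TracingOrderedDict and TracedCounter are hypothetical names made up for the demonstration. Each assignment made by Counter.update() is recorded before being passed on to OrderedDict.__setitem__():

```python
from collections import Counter, OrderedDict

calls = []  # records every (key, value) pair passed to __setitem__


class TracingOrderedDict(OrderedDict):
    """Hypothetical OrderedDict subclass that logs item assignments."""

    def __setitem__(self, key, value):
        calls.append((key, value))
        super().__setitem__(key, value)  # continues to OrderedDict.__setitem__


class TracedCounter(Counter, TracingOrderedDict):
    pass


tc = TracedCounter()
tc.update('foo')          # resolved to Counter.update
print(calls)              # [('f', 1), ('o', 1), ('o', 2)]
print(list(tc.items()))   # [('f', 1), ('o', 2)]
```

The log shows one __setitem__ call per element counted, and the items come back in first-seen order because every assignment went through the OrderedDict layer.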

If the base classes are reversed, it no longer works, because the mro is different and the wrong methods get called.

class OrderedCounter(OrderedDict, Counter):   # <<<== doesn't work
    pass
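To see what changes, the two orderings can be compared side by side. This is a small sketch, assuming CPython 3; ReversedBases, mro_names, and provider are hypothetical names used only here. It prints each mro and the class that ends up supplying update():

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    pass


class ReversedBases(OrderedDict, Counter):  # hypothetical name for the reversed variant
    pass


def mro_names(cls):
    # Class names in method resolution order.
    return [klass.__name__ for klass in cls.__mro__]


def provider(cls, name):
    # First class in the mro whose own namespace defines name.
    return next(klass for klass in cls.__mro__ if name in vars(klass))


print(mro_names(OrderedCounter))
# ['OrderedCounter', 'Counter', 'OrderedDict', 'dict', 'object']
print(mro_names(ReversedBases))
# ['ReversedBases', 'OrderedDict', 'Counter', 'dict', 'object']

print(provider(OrderedCounter, 'update').__name__)  # Counter
print(provider(ReversedBases, 'update').__name__)   # OrderedDict
```

With the bases reversed, update() resolves to OrderedDict's version, which stores key/value pairs instead of counting elements, so the counting logic in Counter.update() is never reached.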

More info on mro can be found in the Python 2.3 documentation.