What is the purpose of collections.ChainMap?

alecxe picture alecxe · Apr 30, 2014 · Viewed 18.2k times · Source

In Python 3.3 a ChainMap class was added to the collections module:

A ChainMap class is provided for quickly linking a number of mappings so they can be treated as a single unit. It is often much faster than creating a new dictionary and running multiple update() calls.

Example:

>>> from collections import ChainMap
>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 10, 'c': 11}
>>> z = ChainMap(y, x)
>>> for k, v in z.items():
        print(k, v)
a 1
c 11
b 10

It was motivated by this issue and made public by this one (no PEP was created).

As far as I understand, it is an alternative to having an extra dictionary and maintaining it with update()s.

The questions are:

  • What use cases does ChainMap cover?
  • Are there any real world examples of ChainMap?
  • Is it used in third-party libraries that switched to python3?

Bonus question: is there a way to use it on Python2.x?


I've heard about it in Transforming Code into Beautiful, Idiomatic Python PyCon talk by Raymond Hettinger and I'd like to add it to my toolkit, but I lack in understanding when should I use it.

Answer

shx2 picture shx2 · May 3, 2014

I like @b4hand's examples, and indeed I have used in the past ChainMap-like structures (but not ChainMap itself) for the two purposes he mentions: multi-layered configuration overrides, and variable stack/scope emulation.

I'd like to point out two other motivations/advantages/differences of ChainMap, compared to using a dict-update loop, thus only storing the "final" version":

  1. More information: since a ChainMap structure is "layered", it supports answering question like: Am I getting the "default" value, or an overridden one? What is the original ("default") value? At what level did the value get overridden (borrowing @b4hand's config example: user-config or command-line-overrides)? Using a simple dict, the information needed for answering these questions is already lost.

  2. Speed tradeoff: suppose you have N layers and at most M keys in each, constructing a ChainMap takes O(N) and each lookup O(N) worst-case[*], while construction of a dict using an update-loop takes O(NM) and each lookup O(1). This means that if you construct often and only perform a few lookups each time, or if M is big, ChainMap's lazy-construction approach works in your favor.

[*] The analysis in (2) assumes dict-access is O(1), when in fact it is O(1) on average, and O(M) worst case. See more details here.