In Python 3.3 a ChainMap class was added to the collections module:
A ChainMap class is provided for quickly linking a number of mappings so they can be treated as a single unit. It is often much faster than creating a new dictionary and running multiple update() calls.
Example:
>>> from collections import ChainMap
>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 10, 'c': 11}
>>> z = ChainMap(y, x)
>>> for k, v in z.items():
...     print(k, v)
a 1
c 11
b 10
It was motivated by this issue and made public by this one (no PEP was created).
As far as I understand, it is an alternative to having an extra dictionary and maintaining it with update() calls.
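To make that comparison concrete, here is a minimal sketch that reuses x and y from the example above. The names merged and chained are only illustrative. It shows one practical difference: the dictionary built with update() is a snapshot, while the ChainMap keeps references to the underlying mappings and so reflects later changes.

```python
from collections import ChainMap

x = {'a': 1, 'b': 2}
y = {'b': 10, 'c': 11}

# Approach 1: an extra dictionary built with update(); it is a snapshot.
merged = dict(x)
merged.update(y)

# Approach 2: a ChainMap; it keeps references to y and x, nothing is copied.
chained = ChainMap(y, x)

# Change the underlying dictionaries after both views were built.
x['a'] = 100
y['d'] = 12

snapshot_a = merged['a']        # the copy does not see the change to x
live_a = chained['a']           # the ChainMap looks the key up in x at access time
snapshot_has_d = 'd' in merged  # the new key never reached the copy
live_has_d = 'd' in chained     # the new key is visible through the chain

# Writes through a ChainMap go to the first mapping only (here: y).
chained['b'] = 20
```

With the update() approach, merged would have to be rebuilt (or patched) after every change to x or y to stay in sync.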
The questions are:
What use cases does ChainMap cover?
[...] ChainMap?
Bonus question: is there a way to use it on Python 2.x?
I heard about it in Raymond Hettinger's PyCon talk "Transforming Code into Beautiful, Idiomatic Python" and I'd like to add it to my toolkit, but I don't understand when I should use it.
I like @b4hand's examples, and indeed I have used ChainMap-like structures in the past (but not ChainMap itself) for the two purposes he mentions: multi-layered configuration overrides, and variable stack/scope emulation.
I'd like to point out two other motivations/advantages/differences of ChainMap, compared to using a dict-update loop and thus storing only the "final" version:
1. More information: since a ChainMap structure is "layered", it supports answering questions like: Am I getting the "default" value, or an overridden one? What is the original ("default") value? At what level did the value get overridden (borrowing @b4hand's config example: user-config or command-line-overrides)? With a simple dict, the information needed to answer these questions is already lost.
2. Speed tradeoff: suppose you have N layers and at most M keys in each. Constructing a ChainMap takes O(N) and each lookup is O(N) worst-case[*], while constructing a dict using an update-loop takes O(NM) and each lookup is O(1). This means that if you construct often and only perform a few lookups each time, or if M is big, ChainMap's lazy-construction approach works in your favor.
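A rough sketch of the speed tradeoff, counting operations rather than timing them. CountingDict is a hypothetical helper written only for this illustration, and the values of N and M are arbitrary. It relies on ChainMap trying each mapping in order with ordinary [] access until one succeeds.

```python
from collections import ChainMap

class CountingDict(dict):
    """dict that counts [] probes (hypothetical helper, illustration only)."""
    probes = 0  # class-level counter shared by all layers

    def __getitem__(self, key):
        CountingDict.probes += 1
        return super().__getitem__(key)

N, M = 5, 1000  # N layers, M distinct keys per layer (arbitrary sizes)
layers = [CountingDict((f"k{i}_{j}", i) for j in range(M)) for i in range(N)]

# Construction is O(N): the ChainMap stores references to the N layers.
chain = ChainMap(*layers)
layers_are_shared = all(a is b for a, b in zip(chain.maps, layers))

# Worst-case lookup: a key that lives only in the last layer probes all N layers.
CountingDict.probes = 0
value = chain[f"k{N - 1}_0"]
worst_case_probes = CountingDict.probes

# Best case: a key in the first layer needs a single probe.
CountingDict.probes = 0
first_value = chain["k0_0"]
best_case_probes = CountingDict.probes

# The update-loop alternative copies every key once: O(N*M) work up front,
# after which each lookup is a single dict access.
merged = {}
for layer in reversed(layers):  # lowest priority first, so earlier layers win
    merged.update(layer)
```

Here building merged touches all N*M keys, whereas building chain only stores N references; in exchange, each lookup on chain may walk through up to N layers.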
[*] The analysis in (2) assumes dict-access is O(1), when in fact it is O(1) on average, and O(M) worst case. See more details here.
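Returning to the "more information" point, here is a minimal sketch of how the layers can be inspected through the maps attribute. The layer contents, the layer_names list and the source_of helper are all hypothetical, made up for this illustration in the spirit of the config example.

```python
from collections import ChainMap

# Hypothetical configuration layers; earlier arguments take priority.
defaults = {"color": "blue", "verbose": False}
user_config = {"color": "green"}
cli_overrides = {"verbose": True}

config = ChainMap(cli_overrides, user_config, defaults)
layer_names = ["cli_overrides", "user_config", "defaults"]  # same order as config.maps

def source_of(chain, key, names):
    """Return the name of the first layer that defines key (hypothetical helper)."""
    for name, mapping in zip(names, chain.maps):
        if key in mapping:
            return name
    raise KeyError(key)

# At what level did the value get set?
color_source = source_of(config, "color", layer_names)
verbose_source = source_of(config, "verbose", layer_names)

# What is the original ("default") value, and was it overridden?
default_color = config.maps[-1]["color"]
color_is_overridden = config["color"] != default_color
```

After a dict-update loop only the final values ("green" and True here) would remain, so none of these questions could be answered.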