Are there any self-balancing binary search tree (RED-BLACK, AVL or others) built-in types in Python 2.7 or Python 3.x?
I am looking for something equivalent to Java's TreeMap or TreeSet.
If there are no such built-ins, why have they been ommited? Is there a special reason, for not including such tools?
There's no special reason, to my knowledge - I'd guess that the reason is that for so many applications the highly-tuned dict
and set
implementations (which are hash tables) work well. They're good enough in most cases. There are definitely situations where you need the performance characteristics of balanced binary search trees (like ordered traversal based on key- rather than addition-order), but those are far enough off the beaten path that people are happy with grabbing a third-party package in that case.
I've had a good experience using the bintrees package on PyPI. This has implementations of unbalanced, AVL and red-black binary trees, in both pure Python and as extensions written in Cython.
I think the rest of the reason is essentially historical accident. If the person who wrote bintrees lobbied for its inclusion in the stdlib, and was willing to put up with the constraints that imposes on maintenance and releases, it would probably go in. (Although the Cython dependency would cause a problem, I'd guess.)
Algorithmic complexity:
For hash tables (like dicts or sets), insertion and lookup are O(1), while for a balanced tree these are O(log(n)). In-order traversal of keys is O(n) in a tree, but to do the same thing with a hash table you need to sort the keys first, so it's O(n*log(n)). When you're picking which kind of data structure to use, you need to think about which operations you're going to be using, and pick the tradeoff that makes the most sense in your application.