I just ran across something interesting that I thought I'd ask about.
When adding a dictionary to a set, I had assumed that the dictionary would be added as a full dictionary, but it isn't. Only the keys are added:
dicty = {"Key1": "Val1", "Key2": "Val2"}
setunion = set()
setunion.union(dicty)
=> set(['Key2', 'Key1'])
When you attempt to add it using set.add(), you get an error:
setadd = set()
setadd.add(dicty)
Traceback (most recent call last):
File "python", line 1, in <module>
TypeError: unhashable type: 'dict'
Obviously, this behaviour is very different from lists:
listy = []
listy.append(dicty)
listy
=> [{'Key2': 'Val2', 'Key1': 'Val1'}]
In the docs it says that sets are unordered collections of hashable objects, which is a hint to some of the issues above.
What's going on here? Set items have to be hashable, so clearly that has to do with why I'm only adding the keys to the set with .union(), but why the error with .add()?
Is there some usability reason behind the difference in behavior of sets from lists?
Is there a datatype in Python (or a library) that essentially functions like a list, but only keeps unique items?
No, that's impossible by definition. The way hash tables (like dicts and sets) do lookups is fundamentally different from the way arrays (like lists) do lookups. The logical problem is that if you have a datatype that only keeps unique items, what happens if you mutate one of the elements so that it is no longer unique?
a, b = [0], [0, 1]
s = SpecialSet(a, b)  # hypothetical list-like type that keeps only unique items
a.append(1)  # now a == b, so s holds a duplicate. NOW WHAT?!
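As for the difference between .union() and .add() in the question, here is a minimal sketch, assuming Python 3. union() accepts any iterable and adds each item it yields, and iterating over a dict yields its keys, which are hashable. add() treats its argument as a single element and has to hash it, which fails for a dict. Note also that union() returns a new set and leaves the original untouched. The variable names below are only for illustration.

```python
dicty = {"Key1": "Val1", "Key2": "Val2"}

# union() iterates over its argument; iterating a dict yields its keys.
from_union = set().union(dicty)
same_as_keys = set(iter(dicty))

# union() returns a new set and does not modify the set it is called on.
original = set()
original.union(dicty)
original_still_empty = len(original) == 0

# add() treats its argument as one element and must hash it.
try:
    set().add(dicty)
    add_error = None
except TypeError as exc:
    add_error = str(exc)
```

To modify a set in place with the keys, update() (or the |= operator) is the mutating counterpart of union().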
If you want to add a dictionary to a set, you can add the dict.items view of it (which is essentially a sequence of (key, value) tuples), but you have to cast it to a tuple first.
a = {1:2, 3:4}
s = set()
s.add(tuple(a.items()))
Then you'd have to cast it back to a dict once it leaves the set to get a dictionary back:
for tup in s:
    new_a = dict(tup)
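One caveat with the tuple approach: a tuple is order-sensitive, so two dicts with the same contents but a different insertion order produce different tuples and would both end up in the set. A sketch of an alternative, assuming every value in the dict is itself hashable, is to store a frozenset of the items instead:

```python
a = {1: 2, 3: 4}
b = {3: 4, 1: 2}  # same contents as a, different insertion order

# The tuple forms differ because tuples compare element by element, in order.
tuples_differ = tuple(a.items()) != tuple(b.items())

s = set()
s.add(frozenset(a.items()))
s.add(frozenset(b.items()))  # equal to the first frozenset, so not added again

# Cast each stored frozenset back to a dict on the way out.
recovered = [dict(pairs) for pairs in s]
```

Either way, this fails with a TypeError if any value is unhashable (a list, say), since the (key, value) tuples then cannot be hashed.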
A built-in frozendict type was proposed in PEP 416 but ultimately rejected.