How can I make a Python dataclass hashable without making it immutable?

Brian C. · Sep 18, 2018 · Viewed 17.6k times

Say I have a dataclass in Python 3. I want to be able to hash and order these objects, but I do not want them to be immutable.

I only want them ordered/hashed on id.

I see in the docs that I can just implement __hash__ and all that, but I'd like to get dataclasses to do the work for me because they are intended to handle this.

from dataclasses import dataclass, field

@dataclass(eq=True, order=True)
class Category:
    id: str = field(compare=True)
    name: str = field(default="set this in post_init", compare=False)

a = sorted(list(set([ Category(id='x'), Category(id='y')])))

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'Category'

Answer

Aran-Fey · Sep 18, 2018

From the docs:

Here are the rules governing implicit creation of a __hash__() method:

[...]

If eq and frozen are both true, by default dataclass() will generate a __hash__() method for you. If eq is true and frozen is false, __hash__() will be set to None, marking it unhashable (which it is, since it is mutable). If eq is false, __hash__() will be left untouched meaning the __hash__() method of the superclass will be used (if the superclass is object, this means it will fall back to id-based hashing).

Since you set eq=True and left frozen at the default (False), your dataclass is unhashable.
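
The three rules quoted above can be checked directly. This is a minimal sketch; the class names Mutable, Frozen and NoEq are made up for illustration and are not part of the question's code.

```python
from dataclasses import dataclass


@dataclass(eq=True)  # frozen is left at its default, False
class Mutable:
    id: str


@dataclass(eq=True, frozen=True)
class Frozen:
    id: str


@dataclass(eq=False)
class NoEq:
    id: str


# eq=True, frozen=False: __hash__ is set to None, so instances are unhashable.
print(Mutable.__hash__ is None)  # True

# eq=True, frozen=True: a __hash__ is generated from the fields.
print(hash(Frozen("x")) == hash(Frozen("x")))  # True

# eq=False: __hash__ is left untouched and inherited from object.
print(NoEq.__hash__ is object.__hash__)  # True
```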

You have 3 options:

  • Set frozen=True (in addition to eq=True), which will make your class immutable and hashable.
  • Set unsafe_hash=True, which will create a __hash__ method but leave your class mutable, thus risking problems if an instance of your class is modified while stored in a dict or set:

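    # Assumes Category is declared with unsafe_hash=True.
    cat = Category('foo', 'bar')
    categories = {cat}
    cat.id = 'baz'  # changes the hash while cat is stored in the set

    print(cat in categories)  # False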
    
  • Manually implement a __hash__ method.
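
For the id-only hashing the question asks for, the second and third options might look like the sketch below. It relies on fields declared with compare=False being left out of the generated __hash__ by default, so unsafe_hash=True hashes on id alone. ManualCategory is a hypothetical name used only to show both variants side by side, and the empty default for name is a simplification of the original.

```python
from dataclasses import dataclass, field


# Option 2: unsafe_hash=True. name has compare=False, so it is excluded
# from the generated __eq__, ordering methods and __hash__.
@dataclass(eq=True, order=True, unsafe_hash=True)
class Category:
    id: str
    name: str = field(default="", compare=False)


# Option 3: a hand-written __hash__. With eq=True and frozen=False,
# dataclass() keeps a __hash__ that is explicitly defined in the class body.
@dataclass(eq=True, order=True)
class ManualCategory:
    id: str
    name: str = field(default="", compare=False)

    def __hash__(self):
        return hash(self.id)


# The two 'x' instances compare equal (name is ignored), so the set keeps one.
ordered = sorted({Category(id="y"), Category(id="x"), Category(id="x", name="dup")})
print([c.id for c in ordered])  # ['x', 'y']
```

Both variants stay mutable, so the caveat from the second option still applies: changing id on an instance that is already in a set or used as a dict key makes it unfindable there.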