Python custom set intersection

Ben Stott picture Ben Stott · May 30, 2012 · Viewed 7.1k times · Source

So, there exists an easy way to calculate the intersection of two sets via set.intersection(). However, I have the following problem:

class Person(Object):                    
    def __init__(self, name, age):                                                      
        self.name = name                                                                
        self.age = age                                                                  

l1 = [Person("Foo", 21), Person("Bar", 22)]                                             
l2 = [Person("Foo", 21), Person("Bar", 24)]                                             

union_list = list(set(l1).union(l2))                                           
# [Person("Foo", 21), Person("Bar", 22), Person("Bar", 24)]

(Object is a base-class provided by my ORM that implements basic __hash__ and __eq__ functionality, which essentially adds every member of the class to the hash. In other words, the __hash__ returned will be a hash of every element of the class)

At this stage, I would like to run a set intersection operation by .name only, to find, say, Person('Bar', -1).intersection(union_list) #= [Person("Bar", -1), Person("Bar", 22), Person("Bar", 24)]. (the typical .intersection() at this point would not give me anything, I can't override __hash__ or __eq__ on the Person class, as this would override the original set union (I think)

What's the best way to do this in Python 2.x?

EDIT: Note that the solution doesn't have to rely on a set. However, I need to find unions and then intersections, so it feels like this is amenable to a set (but I'm willing to accept solutions that use whatever magic you deem worthy, so long as it solves my problem!)

Answer

Jonas Byström picture Jonas Byström · May 30, 2012

Sounds like

>>> class Person:
...     def __init__(self, name, age):
...         self.name = name
...         self.age = age
...     def __eq__(self, other):
...         return self.name == other.name
...     def __hash__(self):
...         return hash(self.name)
...     def __str__(self):
...         return self.name
...
>>> l1 = [Person("Foo", 21), Person("Bar", 22)]
>>> l2 = [Person("Foo", 21), Person("Bar", 24)]
>>> union_list = list(set(l1).union(l2))
>>> [str(l) for l in union_list]
['Foo', 'Bar']

is what you want, since name is your unique key?