Defining __repr__ when subclassing set in Python

me_and · Dec 13, 2011

I'm trying to subclass the built-in set type in Python, using code similar to the code below, but I can't work out a sensible definition of __repr__ to use.

class Alpha(set):
    def __init__(self, name, s=()):
        super(Alpha, self).__init__(s)
        self.name = name

I'd like to define __repr__ in such a way that I can get the following output:

>>> Alpha('Salem', (1,2,3))
Alpha('Salem', set([1, 2, 3]))

However, if I don't override __repr__, the output I get ignores the name value…

>>> Alpha('Salem', (1,2,3))
Alpha([1, 2, 3])
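
The inherited set.__repr__ only uses the class name and the elements, so any extra attribute is left out. A minimal check of this, assuming Python 3 (where the elements are printed in braces rather than as a list):

```python
class Alpha(set):
    def __init__(self, name, s=()):
        super().__init__(s)
        self.name = name


a = Alpha('Salem', (1, 2, 3))
default_repr = repr(a)          # produced by the inherited set.__repr__
print(default_repr)
print('Salem' in default_repr)  # False: the name attribute is not included
```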

…while if I do override __repr__, I can't get direct access to the values in the set without creating a new set instance:

class Alpha(set):
    …
    def __repr__(self):
        return "%s(%r, %r)" % (self.__class__.__name__, self.name, set(self))

This works, but creating a new set instance for __repr__ that will then be disposed of seems clunky and inefficient to me.

Is there a better way to define __repr__ for this sort of class?

Edit: Another solution has occurred to me: I can keep a separate copy of the set as an attribute. This seems slightly neater than the other options (creating and destroying an object on every call to __repr__, or using some form of string manipulation), but it still seems less than ideal to me.

class Alpha(set):
    def __init__(self, name, s=()):
        super(Alpha, self).__init__(s)
        self.name = name
        self._set = set(s)
    def __repr__(self):
        return "%s(%r, %r)" % (self.__class__.__name__, self.name, self._set)
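
One caveat with the stored-copy variant above: the copy is a separate object, so it is not updated when the instance itself is modified. A small sketch of that behaviour, assuming Python 3:

```python
class Alpha(set):
    def __init__(self, name, s=()):
        super().__init__(s)
        self.name = name
        self._set = set(s)  # separate copy, as in the variant above

    def __repr__(self):
        return "%s(%r, %r)" % (self.__class__.__name__, self.name, self._set)


a = Alpha('Salem', (1, 2, 3))
a.add(4)                 # mutates the Alpha instance itself, not a._set
print(sorted(a))         # the instance now holds 1, 2, 3 and 4
print(sorted(a._set))    # the stored copy still holds only 1, 2 and 3
print(repr(a))           # so the repr does not mention 4
```

Keeping the copy in sync would mean overriding every mutating method, which is why building the repr from the instance's current contents is usually simpler.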

Answer

jdi · Dec 13, 2011

I think the following gets you what you want, and I've included some benchmarks. The three variants take almost exactly the same time, although I'm sure they differ in memory usage.

#!/usr/bin/env python

import time

class Alpha(set):
    def __init__(self, name, s=()):
        super(Alpha, self).__init__(s)
        self.name = name
    def __repr__(self):
        # Copy the elements into a temporary list and wrap it in set(...)
        return '%s(%r, set(%r))' % (self.__class__.__name__,
                                    self.name,
                                    list(self))

class Alpha2(set):
    def __init__(self, name, s=()):
        super(Alpha2, self).__init__(s)
        self.name = name
    def __repr__(self):
        # Copy the elements into a temporary plain set; its repr is already
        # set([...]), so no extra set(...) wrapper is needed
        return '%s(%r, %r)' % (self.__class__.__name__,
                               self.name,
                               set(self))

class Alpha3(set):
    def __init__(self, name, s=()):
        super(Alpha3, self).__init__(s)
        self.name = name
    def __repr__(self):
        # Reuse the inherited repr and swap the class name for 'set'
        rep = super(Alpha3, self).__repr__()
        rep = rep.replace(self.__class__.__name__, 'set', 1)
        return '%s(%r, %s)' % (self.__class__.__name__,
                               self.name,
                               rep)

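def timeit(exp, repeat=10000):
    """Run the statement exp repeat times; return the mean time in milliseconds."""
    results = []
    for _ in xrange(repeat):
        start = time.time()
        exec(exp)
        elapsed = time.time() - start
        results.append(elapsed * 1000)  # seconds -> milliseconds
    return sum(results) / len(results)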

if __name__ == "__main__":
    # Note: each timed statement only constructs an instance; __repr__ is not called
    print "Alpha():  ", timeit("a = Alpha('test', (1,2,3,4,5))")
    print "Alpha2(): ", timeit("a = Alpha2('test', (1,2,3,4,5))")
    print "Alpha3(): ", timeit("a = Alpha3('test', (1,2,3,4,5))")

Results (mean time per iteration, in milliseconds):

Alpha():   0.0287627220154
Alpha2():  0.0286467552185
Alpha3():  0.0285225152969
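
Note that the statements timed above only build the objects and never call __repr__. As a rough sketch of how repr() itself could be timed, assuming Python 3 and the standard library timeit module (the names Base, ViaList, ViaSet, ViaReplace and time_repr are made up for this illustration and mirror the three variants):

```python
import timeit


class Base(set):
    def __init__(self, name, s=()):
        super().__init__(s)
        self.name = name


class ViaList(Base):
    def __repr__(self):
        # Copies the elements into a temporary list.
        return '%s(%r, set(%r))' % (type(self).__name__, self.name, list(self))


class ViaSet(Base):
    def __repr__(self):
        # Copies the elements into a temporary plain set.
        return '%s(%r, %r)' % (type(self).__name__, self.name, set(self))


class ViaReplace(Base):
    def __repr__(self):
        # Reuses the inherited repr and swaps the class name for 'set'.
        rep = super().__repr__().replace(type(self).__name__, 'set', 1)
        return '%s(%r, %s)' % (type(self).__name__, self.name, rep)


def time_repr(cls, number=10000):
    """Return the mean time of one repr() call, in milliseconds."""
    obj = cls('test', (1, 2, 3, 4, 5))
    total = timeit.timeit(lambda: repr(obj), number=number)
    return total / number * 1000


timings = {cls.__name__: time_repr(cls) for cls in (ViaList, ViaSet, ViaReplace)}
for label, ms in timings.items():
    print('%-11s %.6f ms' % (label, ms))
```

The numbers depend on the machine and on the size of the set, so it is worth running this with data that resembles your own before choosing a variant.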