Subclassing dict: should dict.__init__() be called?

Eric O Lebigot · Jan 9, 2010

Here is a twofold question, with a theoretical part, and a practical one:

When subclassing dict:

class ImageDB(dict):
    def __init__(self, directory):
        dict.__init__(self)  # Necessary?? 
        ...

Should dict.__init__(self) be called, just as a "safety" measure (e.g., in case there are some non-trivial implementation details that matter)? Is there a risk that the code breaks with a future version of Python if dict.__init__() is not called? I'm looking for a fundamental reason for doing one thing or the other here (practically, calling dict.__init__() is safe).

My guess is that when ImageDB.__init__(self, directory) is called, self is already a new empty dict object, and that there is therefore no need to call dict.__init__ (I do want the dict to be empty, at first). Is this correct?

Edit:

The more practical question behind the fundamental question above is the following. I was thinking of subclassing dict because I would use the db[…] syntax quite often (instead of doing db.contents[…] all the time); the object's only data (attribute) is indeed really a dict. I want to add a few methods to the database (such as get_image_by_name(), or get_image_by_code(), for instance), and only override the __init__(), because the image database is defined by the directory that contains it.

In summary, the (practical) question could be: what is a good implementation for something that behaves like a dictionary, except that its initialization is different (it only takes a directory name), and that it has additional methods?

"Factories" were mentioned in many answers. So I guess it all boils down to: do you subclass dict, override __init__() and add methods, or do you write a (factory) function that returns a dict, to which you add methods? I'm inclined to prefer the first solution, because the factory function returns an object whose type does not indicate that it has additional semantics and methods, but what do you think?

Edit 2:

I gather from everybody's answer that it is not a good idea to subclass dict when the new class "is not a dictionary", and in particular when its __init__ method cannot take the same arguments as dict's __init__ (which is the case in the "practical question" above). In other words, if I understand correctly, the consensus seems to be: when you subclass, all methods (including initialization) must have the same signature as the base class methods. This allows isinstance(subclass_instance, dict) to guarantee that subclass_instance.__init__() can be used like dict.__init__(), for instance.

Another practical question then pops up: how should a class which is just like dict, except for its initialization method, be implemented? Without subclassing? This would require some bothersome boilerplate code, no?

Answer

Alan Franzoni · Jan 9, 2010

You should probably call dict.__init__(self) when subclassing; in fact, you don't know what's happening precisely in dict (since it's a builtin), and that might vary across versions and implementations. Not calling it may result in improper behaviour, since you can't know where dict is holding its internal data structures.
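As a minimal sketch, this is what calling the base initializer looks like in the question's class. The `directory` attribute and the path are placeholders for illustration; the question only says that the constructor takes a directory.

```python
class ImageDB(dict):
    def __init__(self, directory):
        # Explicitly initialize the dict part; called with no extra
        # arguments, this leaves the mapping empty.
        dict.__init__(self)
        # Hypothetical attribute: remember where the images live.
        self.directory = directory


db = ImageDB("/path/to/images")  # placeholder path
db["logo"] = "logo.png"
print(len(db), isinstance(db, dict))
```

In Python 3 the same call can be spelled super().__init__().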

By the way, you didn't tell us what you want to do. If you want a class with dict (mapping) behaviour and you don't really need a dict (e.g. no code in your software does isinstance(x, dict), as it should be), you're probably better off using UserDict.UserDict or UserDict.DictMixin if you're on Python <= 2.5, or collections.MutableMapping if you're on Python >= 2.6 (in Python 3 it lives at collections.abc.MutableMapping). Those will give your class excellent dict behaviour.
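A rough sketch of the MutableMapping route, assuming Python 3 (where the ABC is imported from collections.abc). The class stores its data in a plain internal dict and implements the five abstract methods; the rest of the mapping interface (get, update, items, `in`, and so on) comes from the ABC. The `_images` attribute, the path, and the body of `get_image_by_name` are made up for illustration.

```python
from collections.abc import MutableMapping


class ImageDB(MutableMapping):
    """Mapping-like image database backed by a plain dict."""

    def __init__(self, directory):
        self.directory = directory
        self._images = {}  # internal storage (hypothetical name)

    # The five abstract methods that MutableMapping requires:
    def __getitem__(self, key):
        return self._images[key]

    def __setitem__(self, key, value):
        self._images[key] = value

    def __delitem__(self, key):
        del self._images[key]

    def __iter__(self):
        return iter(self._images)

    def __len__(self):
        return len(self._images)

    # Extra method named after the question's example; the body is a guess.
    def get_image_by_name(self, name):
        return self._images.get(name)


db = ImageDB("/path/to/images")  # placeholder path
db["logo"] = "logo.png"
db.update(icon="icon.png")  # update() is supplied by MutableMapping
```

With this approach __init__ is free to take whatever arguments it likes, and db[...] still works, but isinstance(db, dict) is False.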

EDIT: I read in another comment that you're not overriding any of dict's methods! Then there's no point in subclassing at all; don't do it.

def createImageDb(directory):
    d = {}
    # do something to fill in the dict
    return d

EDIT 2: you want to inherit from dict to add new methods, but you don't need to override any. Then a good choice might be:

class MyContainer(dict):
    def newmethod1(self, args):
        pass

    def newmethod2(self, args2):
        pass


def createImageDb(directory):
    d = MyContainer()
    # fill the container
    return d

By the way: what methods are you adding? Are you sure you're creating a good abstraction? Maybe you'd be better off with a class which defines the methods you need and uses a "normal" dict internally.

Factory func: http://en.wikipedia.org/wiki/Factory_method_pattern

It's simply a way of delegating the construction of an instance to a function instead of overriding/changing its constructors.
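To make that concrete, here is one way the factory could be filled in. This is only a sketch: the rule "one entry per file, keyed by the file name without its extension" and the `names` method are assumptions for illustration, not something the question specifies.

```python
import os


class MyContainer(dict):
    def names(self):
        # Hypothetical extra method: image names in sorted order.
        return sorted(self)


def createImageDb(directory):
    """Build a MyContainer mapping each file's stem to its full path."""
    d = MyContainer()
    for filename in os.listdir(directory):
        name, _ext = os.path.splitext(filename)
        d[name] = os.path.join(directory, filename)
    return d
```

The constructor of MyContainer keeps dict's signature, so code that expects a dict can still build or copy one the usual way; only the factory knows about directories.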