Small Tables in Python?

akoumjian picture akoumjian · Sep 24, 2009 · Viewed 13.9k times · Source

Let's say I don't have more than one or two dozen objects with different properties, such as the following:

UID, Name, Value, Color, Type, Location

I want to be able to call up all objects with Location = "Boston", or Type = "Primary". Classic database query type stuff.

Most table solutions (pytables, *sql) are really overkill for such a small set of data. Should I simply iterate over all the objects and create a separate dictionary for each data column (adding values to dictionaries as I add new objects)?

This would create dicts like this:

{'Boston' : [234, 654, 234], 'Chicago' : [324, 765, 342] } - where those 3 digit entries represent things like UID's.

As you can see, querying this would be a bit of a pain.

Any thoughts of an alternative?

Answer

Steven Kryskalla picture Steven Kryskalla · Sep 24, 2009

For small relational problems I love using Python's builtin sets.

For the example of location = 'Boston' OR type = 'Primary', if you had this data:

users = {
   1: dict(Name="Mr. Foo", Location="Boston", Type="Secondary"),
   2: dict(Name="Mr. Bar", Location="New York", Type="Primary"),
   3: dict(Name="Mr. Quux", Location="Chicago", Type="Secondary"),
   #...
}

You can do the WHERE ... OR ... query like this:

set1 = set(u for u in users if users[u]['Location'] == 'Boston')
set2 = set(u for u in users if users[u]['Type'] == 'Primary')
result = set1.union(set2)

Or with just one expression:

result = set(u for u in users if users[u]['Location'] == 'Boston'
                              or users[u]['Type'] == 'Primary')

You can also use the functions in itertools to create fairly efficient queries of the data. For example if you want to do something similar to a GROUP BY city:

cities = ('Boston', 'New York', 'Chicago')
cities_users = dict(map(lambda city: (city, ifilter(lambda u: users[u]['Location'] == city, users)), cities))

You could also build indexes manually (build a dict mapping Location to User ID) to speed things up. If this becomes too slow or unwieldy then I would probably switch to sqlite, which is now included in the Python (2.5) standard library.