Let's say I don't have more than one or two dozen objects with different properties, such as the following:
UID, Name, Value, Color, Type, Location
I want to be able to call up all objects with Location = "Boston" or Type = "Primary". Classic database-query stuff.
Most table solutions (pytables, *sql) are really overkill for such a small set of data. Should I simply iterate over all the objects and create a separate dictionary for each data column (adding values to dictionaries as I add new objects)?
This would create dicts like this:
{'Boston': [234, 654, 234], 'Chicago': [324, 765, 342]}
where those three-digit entries represent things like UIDs.
As you can see, querying this would be a bit of a pain.
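For instance, combining two conditions would mean intersecting UID lists by hand (the types dict here is just made up for illustration):

locations = {'Boston': [234, 654, 234], 'Chicago': [324, 765, 342]}
types = {'Primary': [234, 324], 'Secondary': [654, 765, 342]}  # hypothetical second column dict

# Location = 'Boston' AND Type = 'Primary' means walking both lists:
matches = [uid for uid in locations.get('Boston', [])
           if uid in types.get('Primary', [])]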
Any thoughts of an alternative?
For small relational problems I love using Python's built-in sets.
For the example of Location = 'Boston' OR Type = 'Primary', if you had this data:
users = {
    1: dict(Name="Mr. Foo", Location="Boston", Type="Secondary"),
    2: dict(Name="Mr. Bar", Location="New York", Type="Primary"),
    3: dict(Name="Mr. Quux", Location="Chicago", Type="Secondary"),
    # ...
}
You can do the WHERE ... OR ... query like this:
set1 = set(u for u in users if users[u]['Location'] == 'Boston')
set2 = set(u for u in users if users[u]['Type'] == 'Primary')
result = set1.union(set2)
Or with just one expression:
result = set(u for u in users if users[u]['Location'] == 'Boston'
or users[u]['Type'] == 'Primary')
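An AND query maps to set intersection in the same way; reusing set1 and set2 from above:

# WHERE Location = 'Boston' AND Type = 'Primary'
result = set1.intersection(set2)

and set1.difference(set2) covers AND NOT.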
You can also use the functions in itertools to build fairly efficient queries over the data. For example, if you want to do something similar to a GROUP BY on Location:
from itertools import ifilter  # Python 2; in Python 3 the built-in filter is already lazy

cities = ('Boston', 'New York', 'Chicago')
cities_users = dict((city, list(ifilter(lambda u: users[u]['Location'] == city, users)))
                    for city in cities)  # list() materializes ifilter's one-shot iterator
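With the sample users dict above, this comes out as {'Boston': [1], 'New York': [2], 'Chicago': [3]}, since each UID lands under its Location.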
You could also build indexes manually (a dict mapping Location to user IDs) to speed things up. If this becomes too slow or unwieldy, I would probably switch to sqlite, which is included in the Python standard library as of version 2.5.
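For example, a Location index could be built like this (a minimal sketch, assuming the users dict above):

# Build the index once, or maintain it as objects are added:
location_index = {}
for uid, info in users.items():
    location_index.setdefault(info['Location'], []).append(uid)

# A lookup is now a single dict access instead of a full scan:
boston_uids = location_index.get('Boston', [])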