I have a moderately large number of base objects.
These base objects will be put in collections, and these collections will be munged around: sorted, truncated, etc.
Unfortunately, n is large enough that memory consumption is slightly worrisome, and speed is becoming a concern.
My understanding is that tuples are slightly more memory-efficient, since they are deduplicated.
Anyway, I would like to know what the cpu/memory tradeoffs of lists vs. tuples are in Python 2.6/2.7.
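If you have a tuple and a list with the same elements, the tuple takes less space: a tuple is fixed-size, whereas a list carries extra bookkeeping and may over-allocate room for future appends. Since tuples are immutable, you can't sort them in place, add to them, etc.; sorted() still works on a tuple, but it returns a new list. I recommend watching this talk by Alex Gaynor for a quick intro on when to choose what data structure in Python.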
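To check the size difference on your own interpreter, something like the following rough sketch may help. It assumes CPython, and the names items, frozen and top_ten are just illustrative. Note that sys.getsizeof is shallow: it measures the container only, not the objects it refers to, and the exact numbers vary by version and platform.

```python
import sys

items = list(range(1000))   # a list holding 1000 references
frozen = tuple(items)       # a tuple holding the same 1000 references

# Shallow sizes of the containers themselves (elements are not counted).
list_size = sys.getsizeof(items)
tuple_size = sys.getsizeof(frozen)
print("list: %d bytes, tuple: %d bytes" % (list_size, tuple_size))

# A tuple cannot be sorted in place, but sorted() accepts any iterable
# and returns a new list, which can be truncated and frozen again.
top_ten = tuple(sorted(frozen, reverse=True)[:10])
print(top_ten)
```

The per-container saving is small, so it only adds up if you keep a great many collections around.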
UPDATE: Thinking about it some more, you may want to look into optimizing the space usage of your objects, e.g., via __slots__ or using namedtuple instances as proxies instead of the actual objects. This would likely lead to much bigger savings, since you have N of them and (presumably) only a few collections in which they appear. namedtuple in particular is super awesome; check out Raymond Hettinger's talk.
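As a minimal sketch of both options (the class names PlainRecord, SlottedRecord and TupleRecord and the name/score fields are made up for illustration, not taken from your code):

```python
from collections import namedtuple
from operator import attrgetter


class PlainRecord(object):
    """Ordinary class: every instance carries its own __dict__."""
    def __init__(self, name, score):
        self.name = name
        self.score = score


class SlottedRecord(object):
    """Same fields, but __slots__ removes the per-instance __dict__."""
    __slots__ = ("name", "score")

    def __init__(self, name, score):
        self.name = name
        self.score = score


# Immutable, tuple-backed record with named fields.
TupleRecord = namedtuple("TupleRecord", ["name", "score"])

plain = PlainRecord("a", 1)
slotted = SlottedRecord("a", 1)
as_tuple = TupleRecord("a", 1)

print(hasattr(plain, "__dict__"))    # the plain instance has a dict
print(hasattr(slotted, "__dict__"))  # the slotted instance does not

# Collections of records can still be sorted and truncated as usual.
records = [TupleRecord("n%d" % i, i % 7) for i in range(20)]
best = sorted(records, key=attrgetter("score"), reverse=True)[:3]
print(best)
```

The trade-off is flexibility: a slotted instance rejects attributes that are not listed in __slots__, and a namedtuple is immutable, so changing a field means building a new instance with _replace.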