My problem: I've found that processing large data sets in raw C++ with the STL map and vector is often considerably faster (and has a lower memory footprint) than doing the same in Cython.
I figure that part of this speed penalty is due to using Python lists and dicts, and that there might be some tricks to use less encumbered data structures in Cython. For example, this page (http://wiki.cython.org/tutorials/numpy) shows how to make numpy arrays very fast in Cython by predefining the size and types of the ND array.
Question: Is there any way to do something similar with lists/dicts, e.g. by stating roughly how many elements or (key,value) pairs you expect to have in them? That is, is there an idiomatic way to convert lists/dicts to (fast) data structures in Cython?
If not, I guess I'll just have to write it in C++ and wrap it in a Cython import.
Cython now has template support, and comes with declarations for some of the STL containers.
See http://docs.cython.org/src/userguide/wrapping_CPlusPlus.html#standard-library
Here's the example they give:
from libcpp.vector cimport vector

cdef vector[int] vect
cdef int i
# Append 0..9 to the C++ vector
for i in range(10):
    vect.push_back(i)
# Read the elements back by index
for i in range(10):
    print(vect[i])
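To get closer to what the question asks (a dict-like container, and a way to say roughly how many elements to expect), here is a minimal sketch along the same lines. It assumes the module is compiled as C++; the function name `squares` and the alias `cpp_map` are made up for illustration. `vector.reserve` is the STL way to preallocate; `std::map` is a tree and has no equivalent call.

```cython
# distutils: language = c++
from libcpp.map cimport map as cpp_map   # aliased so it doesn't shadow Python's map
from libcpp.vector cimport vector

def squares(int n):
    """Hypothetical helper: fill a C++ vector and map with i -> i*i."""
    if n < 0:
        raise ValueError("n must be non-negative")
    cdef vector[int] values
    cdef cpp_map[int, int] lookup
    cdef int i
    values.reserve(n)            # preallocate room for n elements up front
    for i in range(n):
        values.push_back(i * i)
        lookup[i] = i * i        # operator[] inserts the key if it is missing
    # On return, Cython converts the vector to a list and the map to a dict
    return values, lookup
```

The conversion back to list/dict on return copies the data, so this only pays off if the heavy lifting stays inside the typed containers and you convert once at the end.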