We have an e-commerce portal with a PostgreSQL 9.1 database. One very important table currently has 32 million records. If we wanted to deliver all items, this table would grow to 320 million records, mostly dates, which would be too heavy.
So we are thinking about horizontal partitioning / sharding. We can divide the items in this table into 12 horizontal pieces (one per month). What would be the best steps and techniques to do so? Would horizontal partitioning within the database be good enough, or do we have to start thinking about sharding?
While 320 million is not small, it's not really huge either.
It largely depends on the queries you run against the table. If you always include the partition key in your queries, then "regular" partitioning would probably work.
An example of this can be found in the PostgreSQL wiki:
http://wiki.postgresql.org/wiki/Month_based_partitioning
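For illustration, here is a rough sketch of the inheritance-based setup that wiki page describes (the table and column names are made up, and only two monthly children are shown; a real setup would have one per month):

    -- Parent table; queries go against this one.
    CREATE TABLE items (
        id         bigint      NOT NULL,
        created_at timestamptz NOT NULL,
        payload    text
    );

    -- One child table per month, each with a CHECK constraint
    -- so the planner can exclude irrelevant partitions.
    CREATE TABLE items_2013_01 (
        CHECK (created_at >= '2013-01-01' AND created_at < '2013-02-01')
    ) INHERITS (items);

    CREATE TABLE items_2013_02 (
        CHECK (created_at >= '2013-02-01' AND created_at < '2013-03-01')
    ) INHERITS (items);

    -- Trigger that routes inserts on the parent to the right child.
    CREATE OR REPLACE FUNCTION items_insert_trigger()
    RETURNS trigger AS $$
    BEGIN
        IF NEW.created_at >= '2013-01-01' AND NEW.created_at < '2013-02-01' THEN
            INSERT INTO items_2013_01 VALUES (NEW.*);
        ELSIF NEW.created_at >= '2013-02-01' AND NEW.created_at < '2013-03-01' THEN
            INSERT INTO items_2013_02 VALUES (NEW.*);
        ELSE
            RAISE EXCEPTION 'created_at out of range, add a partition';
        END IF;
        RETURN NULL;  -- the row was already inserted into the child
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER items_partition_insert
        BEFORE INSERT ON items
        FOR EACH ROW EXECUTE PROCEDURE items_insert_trigger();

A nice side effect of this layout: new partitions can be added month by month, and old ones dropped cheaply with DROP TABLE instead of an expensive DELETE.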
The manual also explains some of the caveats of partitioning:
http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html
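The main caveat in practice is the one mentioned above: the planner can only skip partitions when the WHERE clause pins down the partition key. Assuming the sketch from before, you can verify the pruning like this:

    -- In 9.1, constraint_exclusion defaults to 'partition',
    -- which is what you want for this setup.
    SET constraint_exclusion = partition;

    -- The plan should only scan items_2013_01, not all twelve children.
    EXPLAIN SELECT count(*)
    FROM items
    WHERE created_at >= '2013-01-01'
      AND created_at <  '2013-02-01';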
If you are thinking about sharding, you might read how Instagram (which is powered by PostgreSQL) has implemented that:
http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram
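The core of their approach is generating 64-bit IDs inside each shard, packing a timestamp, a logical shard number, and a per-shard sequence into a single bigint. A sketch of such a generator function (the schema name, epoch value, and shard number here are placeholders):

    CREATE SCHEMA shard_1;
    CREATE SEQUENCE shard_1.items_id_seq;

    -- 64-bit ID: 41 bits of milliseconds since a custom epoch,
    -- 13 bits of logical shard ID, 10 bits of a per-shard sequence.
    CREATE OR REPLACE FUNCTION shard_1.next_id(OUT result bigint) AS $$
    DECLARE
        our_epoch  bigint := 1314220021721;  -- custom epoch in milliseconds
        seq_id     bigint;
        now_millis bigint;
        shard_id   int := 1;                 -- this schema's logical shard number
    BEGIN
        SELECT nextval('shard_1.items_id_seq') % 1024 INTO seq_id;
        SELECT FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 1000) INTO now_millis;
        result := (now_millis - our_epoch) << 23;
        result := result | (shard_id << 10);
        result := result | (seq_id);
    END;
    $$ LANGUAGE plpgsql;

Because the timestamp occupies the high bits, the IDs remain roughly time-ordered across all shards, which is handy for pagination.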
If you have mostly read queries, another option might be to use streaming replication to set up multiple servers and distribute the read queries by connecting to the hot standby for read access and to the master for write access. I think pgpool-II can do that (somewhat) automatically. That can be combined with partitioning to further reduce query runtime.
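A minimal pgpool.conf excerpt for such a read/write split might look like the following (hostnames are placeholders, and exact parameter names may vary between pgpool-II versions):

    # Send writes to the primary, load-balance reads across the standby.
    master_slave_mode     = on
    master_slave_sub_mode = 'stream'   # streaming replication
    load_balance_mode     = on

    backend_hostname0 = 'master.example.com'
    backend_port0     = 5432
    backend_weight0   = 1

    backend_hostname1 = 'standby.example.com'
    backend_port1     = 5432
    backend_weight1   = 1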
If you are adventurous and don't have immediate needs, you might also consider Postgres-XC, which promises to support transparent horizontal scaling:
http://postgres-xc.sourceforge.net/
There is no final release yet, but it looks like one isn't too far off.