PostgreSQL: improving pg_dump, pg_restore performance

Joe Creighton picture Joe Creighton · Jan 19, 2010 · Viewed 57.3k times · Source

When I began, I used pg_dump with the default plain format. I was unenlightened.

Research revealed to me time and file size improvements with pg_dump -Fc | gzip -9 -c > dumpfile.gz. I was enlightened.

When it came time to create the database anew,

# create tablespace dbname location '/SAN/dbname';
# create database dbname tablespace dbname;
# alter database dbname set temp_tablespaces = dbname;

% gunzip dumpfile.gz              # to evaluate restore time without a piped uncompression
% pg_restore -d dbname dumpfile   # into a new, empty database defined above

I felt unenlightened: the restore took 12 hours to create the database that's only a fraction of what it will become:

# select pg_size_pretty(pg_database_size('dbname'));
47 GB

Because there are predictions this database will be a few terabytes, I need to look at improving performance now.

Please, enlighten me.

Answer

Ants Aasma picture Ants Aasma · Jan 19, 2010

First check that you are getting reasonable IO performance from your disk setup. Then check that you PostgreSQL installation is appropriately tuned. In particular shared_buffers should be set correctly, maintenance_work_mem should be increased during the restore, full_page_writes should be off during the restore, wal_buffers should be increased to 16MB during the restore, checkpoint_segments should be increased to something like 16 during the restore, you shouldn't have any unreasonable logging on (like logging every statement executed), auto_vacuum should be disabled during the restore.

If you are on 8.4 also experiment with parallel restore, the --jobs option for pg_restore.