PostgreSQL restore and backup solution

tropikalista picture tropikalista · Feb 5, 2009 · Viewed 7.2k times · Source

A) What is the best solution for regularly backing up large PostgreSQL database (version 8.3 running on latest Ubuntu server); please don't say pg_dump with those painfully slow insert statements

B) What is the best solution for PostgreSQL database replication that works in real world

Answer

Jimmy Stenke picture Jimmy Stenke · Feb 5, 2009

I think there is only one answer to that one.

PITR, or point in time recovery. It is basically archiving of transaction logs, and is as far as I know, the best way to do backups.

I have set it up a couple of times for 8.1, but it should be the same in 8.3.

In the postgresql.conf all you need to do is to add this:

archive_command = 'test ! -f /path/to/your/backups/archive_logs/%f && cp -i %p /path/to/your/backups/archive_logs/%f </dev/null'

This command copies the archive logs to the specified directory, where you safely can back it up with the backup software of your choice.

To make a full backup, you need to first tell PostgreSQL that you're taking a backup. It is being done through the psql command psql "SELECT pg_start_backup('my_backup');" After that just copy the data dir with rsync, cpio or some other tool. If the database is heavily used, the files will change during the copy, so it is important that the tool can handle that correctly and not bail out.

After the copy is finished, just run psql "SELECT pg_stop_backup();" to tell PostgreSQL to stop it again. What those commands do is putting a marker in the Archive logs where the backup started, so in a restore, it knows from where it needs to start reading from in there.

This technique can also be used to have a warm standby for replication, but it will not be readable, just ready to take over in case of emergency. Full hot standby is planned in I think version 8.4, so until then I don't think there is another option.

One thing that is great if you use PITR, is that you can specify a timestamp to when you want the archive logs to be appended. So it can also save the database from accidents (like removing or changing some data)