Postgres locks within a transaction

Simon Lepkin picture Simon Lepkin · Nov 22, 2016 · Viewed 7k times · Source

I'm having trouble understanding how locks interact with transactions in Postgres.

When I run this (long) query, I am surprised by the high degree of locking that occurs:

BEGIN;
TRUNCATE foo;
\COPY foo FROM 'backup.txt';
COMMIT;

The documentation for \COPY doesn't mention what level of lock it requires, but this post indicates that it only gets a RowExclusiveLock. But when I run this query during the \COPY:

SELECT mode, granted FROM pg_locks
WHERE relation='foo'::regclass::oid;

I get this:

mode    granted
RowExclusiveLock    true
ShareLock   true
AccessExclusiveLock true

Where the heck is that AccessExclusiveLock coming from? I assume it's coming from the TRUNCATE, which requires an AccessExclusiveLock. But the TRUNCATE finishes quickly, so I'd expect the lock to release quickly as well. This leaves me with a few questions.

When a lock is acquired by a command within a transaction, is that lock released at the end of the command (before the end of the transaction)? If so, why do I observe the above behavior? If not, why not? In fact, since transactions don't touch the table until the COMMIT, why does a TRUNCATE in a transaction need to block the table at all?

I don't see any discussion of this in the documentation for transactions in PG.

Answer

Laurenz Albe picture Laurenz Albe · Nov 23, 2016

There are a couple of misconceptions to be cleared up here.

First, a transaction does touch the table before it is committed. The comment you are quoting says that a ROLLBACK (and a COMMIT as well) don't touch the table, which is something different. They record the transaction state in the commit log (in pg_clog), and COMMIT flushes the transaction log to disk (one notable exception to this is TRUNCATE, which is relevant for your question: the old table is kept around until the end of the transaction and gets deleted during COMMIT).

If all changes were held back until COMMIT and no locks would be taken, COMMIT would be quite expensive and would routinely fail because of concurrent modifications. The transaction would have to remember the state of the database as it was before and check if the changes still apply. This way of handling concurrency is called optimistic concurreny control, and while it is a decent strategy for an application, it won't work well for a relational database, where COMMIT should be efficient and should not fail (unless there are major problems with the infrastructure).

So what relational databases use is pessimistic concurrency control or locking, i.e. they lock a database object before they access it to prevent concurrent activity from getting in their way.

Second, relational databases use two-phase locking, in which locks (at least the user-visible, so-called heavyweight locks) are always held until the end of the transaction. This is necessary (but not sufficient) to keep transactions in a logical order (serializable) and consistent. What if you release a lock, and somebody else removes the row that your inserted, but uncommitted row refers to via a foreign key constraint?

Answer to the question

The upshot of all this is that your table will keep the ACCESS EXCLUSIVE lock from TRUNCATE until the end of the transaction. Isn't it evident why that is necessary? If other transactions were allowed to even read the table after the (as of yet uncommitted) TRUNCATE, they would find it empty, since TRUNCATE really empties the table and does not adhere to MVCC semantics. Such a dirty read (of uncommitted data that might yet be rolled back) cannot be allowed.

If you really need read access to the table during the refill, you could use DELETE instead of TRUNCATE. The downside is that this is a much more expensive operation that will leave the table with a lot of “dead tuples” that have to be removed by autovacuum, resulting in a lot of empty space (table bloat). But if you are willing to live with a table and indexes that are bloated such that table and index scans will take at least twice as long, it is an option.