Avoiding PostgreSQL deadlocks when performing bulk update and delete operations

sanjayav picture sanjayav · Nov 19, 2014 · Viewed 7.4k times · Source

We have a single table which does not have references to any other tables.

┬────────────┬─────────────┬───────────────┬───────────────╮
│id_A(bigint)│id_B(bigint) │val_1(varchar) │val_2(varchar) │
╪════════════╪═════════════╪═══════════════╪═══════════════╡

The primary key of the table is a composite of id_A and id_B.

Reads and writes of this table are highly concurrent and the table has millions of rows. We have several stored procedures which do mass updates and deletes. Those stored procedures are being called concurrently mainly by triggers and application code.

The operations usually look like the following where it could match thousands of records to update/delete:

DELETE FROM table_name 
WHERE id_A = ANY(array_of_id_A)
AND id_B = ANY(array_of_id_B)

UPDATE table_name
SET val_1 = 'some value', val_2 = 'some value'
WHERE id_A = ANY(array_of_id_A)
AND id_B = ANY(array_of_id_B)

We are experiencing deadlocks and all our attempts to perform operations with locks (row level using SELECT FOR UPDATE and table level locks) do not seem to solve these deadlock issues. (Note that we cannot in any way use access exclusive locking on this table because of the performance impact)

Is there another way that we could try to solve these deadlock situations? The reference manual says:

The best defense against deadlocks is generally to avoid them by being certain that all applications using a database acquire locks on multiple objects in a consistent order.

But how could we achieve this in the above scenario. Is there a guaranteed way to do bulk update inset operations in a particular order?

Answer

Erwin Brandstetter picture Erwin Brandstetter · Nov 19, 2014

Use explicit row-level locking in ordered subqueries in all competing queries. (Simple SELECT does not compete.)

DELETE

DELETE FROM table_name t
USING (
   SELECT id_A, id_B
   FROM   table_name 
   WHERE  id_A = ANY(array_of_id_A)
   AND    id_B = ANY(array_of_id_B)
   ORDER  BY id_A, id_B
   FOR    UPDATE
   ) del
WHERE  t.id_A = del.id_A
AND    t.id_B = del.id_B;

UPDATE

UPDATE table_name t
SET    val_1 = 'some value'
     , val_2 = 'some value'
FROM (
   SELECT id_A, id_B
   FROM   table_name 
   WHERE  id_A = ANY(array_of_id_A)
   AND    id_B = ANY(array_of_id_B)
   ORDER  BY id_A, id_B
   FOR    NO KEY UPDATE  -- Postgres 9.3+
-- FOR    UPDATE         -- for older versions or updates on key columns
   ) upd
WHERE  t.id_A = upd.id_A
AND    t.id_B = upd.id_B;

This way, rows are locked in consistent order as advised in the manual.

Assuming that id_A, id_B are never updated, even rare corner case complications like detailed in the "Caution" box in the manual are not possible.

While not updating key columns, you can use the weaker lock mode FOR NO KEY UPDATE. Requires Postgres 9.3 or later.


The other (slow and sure) option is to use the Serializable Isolation Level for competing transactions. You would have to prepare for serialization failures, in which case you have to retry the command.