We have a single table which does not have references to any other tables.
┬────────────┬─────────────┬───────────────┬───────────────╮
│id_A(bigint)│id_B(bigint) │val_1(varchar) │val_2(varchar) │
╪════════════╪═════════════╪═══════════════╪═══════════════╡
The primary key of the table is a composite of id_A and id_B.
Reads and writes of this table are highly concurrent and the table has millions of rows. We have several stored procedures which do mass updates and deletes. Those stored procedures are being called concurrently mainly by triggers and application code.
The operations usually look like the following where it could match thousands of records to update/delete:
DELETE FROM table_name
WHERE id_A = ANY(array_of_id_A)
AND id_B = ANY(array_of_id_B)
UPDATE table_name
SET val_1 = 'some value', val_2 = 'some value'
WHERE id_A = ANY(array_of_id_A)
AND id_B = ANY(array_of_id_B)
We are experiencing deadlocks and all our attempts to perform operations with locks (row level using SELECT FOR UPDATE
and table level locks) do not seem to solve these deadlock issues. (Note that we cannot in any way use access exclusive locking on this table because of the performance impact)
Is there another way that we could try to solve these deadlock situations? The reference manual says:
The best defense against deadlocks is generally to avoid them by being certain that all applications using a database acquire locks on multiple objects in a consistent order.
But how could we achieve this in the above scenario. Is there a guaranteed way to do bulk update inset operations in a particular order?
Use explicit row-level locking in ordered subqueries in all competing queries. (Simple SELECT
does not compete.)
DELETE FROM table_name t
USING (
SELECT id_A, id_B
FROM table_name
WHERE id_A = ANY(array_of_id_A)
AND id_B = ANY(array_of_id_B)
ORDER BY id_A, id_B
FOR UPDATE
) del
WHERE t.id_A = del.id_A
AND t.id_B = del.id_B;
UPDATE table_name t
SET val_1 = 'some value'
, val_2 = 'some value'
FROM (
SELECT id_A, id_B
FROM table_name
WHERE id_A = ANY(array_of_id_A)
AND id_B = ANY(array_of_id_B)
ORDER BY id_A, id_B
FOR NO KEY UPDATE -- Postgres 9.3+
-- FOR UPDATE -- for older versions or updates on key columns
) upd
WHERE t.id_A = upd.id_A
AND t.id_B = upd.id_B;
This way, rows are locked in consistent order as advised in the manual.
Assuming that id_A
, id_B
are never updated, even rare corner case complications like detailed in the "Caution" box in the manual are not possible.
While not updating key columns, you can use the weaker lock mode FOR NO KEY UPDATE
. Requires Postgres 9.3 or later.
The other (slow and sure) option is to use the Serializable Isolation Level for competing transactions. You would have to prepare for serialization failures, in which case you have to retry the command.