What are the conditions for encountering a serialization failure?

Daniel Trebbien · Oct 9, 2011

The PostgreSQL manual page on the Serializable Isolation Level states:

[Like] the Repeatable Read level, applications using this level must be prepared to retry transactions due to serialization failures.

What are the conditions for encountering a serialization failure at the Repeatable Read or Serializable levels?

I tried to induce a serialization failure with two instances of psql running, but even though one instance committed its transaction while the other was still inside a serializable-level transaction, the second instance committed its changes successfully as well. Both simply inserted records into a table, so perhaps I need to try something more complex.
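For example, here is roughly what I did (the table name is just illustrative); both sessions committed without error:

    -- Setup (once):
    CREATE TABLE mytable (x int);
    -- Session 1:
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    INSERT INTO mytable VALUES (1);
    -- Session 2:
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    INSERT INTO mytable VALUES (2);
    COMMIT;  -- succeeds
    -- Session 1:
    COMMIT;  -- also succeeds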

Basically I am trying to understand what happens in the event of a serialization failure and how serialization failures arise.

Answer

kgrittn · Apr 3, 2012

There are many possible causes for serialization failures. Technically, a deadlock between two transactions is one form of serialization failure, and it can happen at any isolation level -- for example, when two transactions update the same rows in opposite order, or when there are concurrent modifications to the schema (database structure). Since you're asking about PostgreSQL, you should be aware that in PostgreSQL this type of serialization failure gets a SQLSTATE distinct from the others: '40P01'. All other serialization failures return '40001'. A classic deadlock is sketched below; the rest of this answer will focus on the non-deadlock varieties in PostgreSQL.
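For illustration, here is the classic row-level deadlock, interleaved between two psql sessions (the accounts table and amounts are hypothetical, and any isolation level will do); one of the two transactions is rolled back with SQLSTATE '40P01':

    -- Setup (once):
    CREATE TABLE accounts (id int PRIMARY KEY, balance int);
    INSERT INTO accounts VALUES (1, 500), (2, 500);
    -- Session 1:
    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    -- Session 2:
    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 2;
    -- Session 1 (blocks, waiting for session 2's lock on row 2):
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;
    -- Session 2 (completes the cycle; the deadlock detector aborts one side):
    UPDATE accounts SET balance = balance + 100 WHERE id = 1;
    -- ERROR:  deadlock detected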

Outside of a live replica ("hot standby"), these can only occur at the two stricter isolation levels: REPEATABLE READ and SERIALIZABLE. At the REPEATABLE READ level, these can only occur because of write conflicts -- two concurrent transactions try to update or delete the same (existing) row. The first transaction to make the attempt locks the row and proceeds. If it commits, the second transaction fails with a serialization failure. If the first transaction rolls back for any reason, the blocked transaction is freed to proceed and will acquire its own lock on the row. This behavior, in combination with a single "snapshot" taken for the duration of the transaction, is also known as SNAPSHOT ISOLATION.
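A minimal reproduction of such a write conflict in two psql sessions might look like this (the table and values are hypothetical):

    -- Setup (once):
    CREATE TABLE t (id int PRIMARY KEY, val int);
    INSERT INTO t VALUES (1, 0);
    -- Session 1:
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    UPDATE t SET val = val + 1 WHERE id = 1;
    -- Session 2 (blocks on session 1's row lock):
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    UPDATE t SET val = val + 2 WHERE id = 1;
    -- Session 1:
    COMMIT;
    -- Session 2 now fails with SQLSTATE '40001':
    -- ERROR:  could not serialize access due to concurrent update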

Prior to PostgreSQL version 9.1, SERIALIZABLE transactions worked exactly the same way. Starting with 9.1, PostgreSQL uses a new technique called Serializable Snapshot Isolation (SSI) to ensure that the behavior of any set of serializable transactions is fully consistent with some serial (one-at-a-time) execution of those transactions. When using SERIALIZABLE transactions in 9.1, your application should be prepared for serialization failures on any statement except ROLLBACK -- even in read-only transactions, and even on COMMIT. For more information, see the PostgreSQL doc page at http://www.postgresql.org/docs/current/interactive/transaction-iso.html or the Wiki page giving examples of how serialization failures can occur under the new, stricter isolation level at http://wiki.postgresql.org/wiki/SSI
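To see the difference from REPEATABLE READ, here is a sketch of "write skew", which only SERIALIZABLE catches; the two transactions update different rows, yet no serial ordering of them is consistent with what both observed (the doctors table is hypothetical, modeled on the examples from the Wiki page above):

    -- Setup (once):
    CREATE TABLE doctors (name text PRIMARY KEY, on_call boolean);
    INSERT INTO doctors VALUES ('alice', true), ('bob', true);
    -- Session 1 (checks that someone else is on call, then goes off call):
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT count(*) FROM doctors WHERE on_call;  -- returns 2
    UPDATE doctors SET on_call = false WHERE name = 'alice';
    -- Session 2 (does the same, concurrently):
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT count(*) FROM doctors WHERE on_call;  -- also returns 2
    UPDATE doctors SET on_call = false WHERE name = 'bob';
    COMMIT;  -- succeeds
    -- Session 1:
    COMMIT;
    -- ERROR:  could not serialize access due to read/write dependencies
    --         among transactions

Under REPEATABLE READ, both commits would succeed, leaving no one on call even though each transaction checked first.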

If you are using the Hot Standby feature, you can get a serialization failure on a read-only replica if there is a long-running query for which maintaining a stable view of the data would require the database to delay the replay of replicated changes for too long. There are configuration settings that let you balance the "freshness" of the replicated data against tolerance for long-running queries. Some users may want to create more than one replica so that one can stay up to date (possibly even using synchronous replication) while another is allowed to lag as needed to service long-running queries.
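Those settings live on the standby; a sketch (the parameter names are the real GUCs as of 9.1, but the values here are purely illustrative):

    # postgresql.conf on the standby
    # How long WAL replay may be delayed to let conflicting queries finish;
    # -1 means wait forever, maximally favoring long-running queries:
    max_standby_streaming_delay = 30s
    max_standby_archive_delay = 30s
    # Or have the standby report its oldest snapshot to the master, avoiding
    # most cancellations at the cost of delayed cleanup (bloat) on the master:
    hot_standby_feedback = on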

Edit to provide another link: the paper titled Serializable Snapshot Isolation in PostgreSQL, presented at the 38th International Conference on Very Large Data Bases, provides more detail and perspective than the other links, along with references to the papers which laid the groundwork for this implementation.