It's not entirely clear to me what transactions in database systems do. I know they can be used to rollback a list of updates completely (e.g. deduct money on one account and add it to another), but is that all they do? Specifically, can they be used to prevent race conditions? For example:
// Java/JPA example
em.getTransaction().begin();
User u = em.find(User.class, 123);
u.credits += 10;
em.persist(u); // Note added in 2016: this line is actually not needed
em.getTransaction().commit();
(I know this could probably be written as a single update query, but that's not alway the case)
Is this code protected against race conditions?
I'm mostly interested in MySQL5 + InnoDB, but general answers are welcome too.
TL/DR: Transactions do not inherently prevent all race conditions. You still need locking, abort-and-retry handling, or other protective measures in all real-world database implementations. Transactions are not a secret sauce you can add to your queries to make them safe from all concurrency effects.
What you're getting at with your question is the I in ACID - isolation. The academically pure idea is that transactions should provide perfect isolation, so that the result is the same as if every transaction executed serially. In reality that's rarely the case in real RDBMS implementations; capabilities vary by implementation, and the rules can be weakened by use of a weaker isolation level like READ COMMITTED
. In practice you cannot assume that transactions prevent all race conditions, even at SERIALIZABLE
isolation.
Some RDBMSs have stronger abilities than others. For example, PostgreSQL 9.2 and newer have quite good SERIALIZABLE
isolation that detects most (but not all) possible interactions between transactions and aborts all but one of the conflicting transactions. So it can run transactions in parallel quite safely.
Few, if any3, systems have truly perfect SERIALIZABLE
isolation that prevents all possible races and anomalies, including issues like lock escalation and lock ordering deadlocks.
Even with strong isolation some systems (like PostgreSQL) will abort conflicting transactions, rather than making them wait and running them serially. Your app must remember what it was doing and re-try the transaction. So while the transaction has prevented concurrency-related anomalies from being stored to the DB, it's done so in a manner that is not transparent to the application.
Arguably the primary purpose of a database transaction is that it provides for atomic commit. The changes do not take effect until you commit the transaction. When you commit, the changes all take effect at the same instant as far as other transactions are concerned. No transaction can ever see just some of the changes a transaction makes1,2. Similarly, if you ROLLBACK
, then none of the transaction's changes ever get seen by any other transaction; it's as if your transaction never existed.
That's the A in ACID.
Another is durability - the D in ACID. It specifies that when you commit a transaction it must truly be saved to storage that will survive a fault like power loss or a sudden reboot.
See wikipedia
Rather than using locking and/or high isolation levels, it's common for ORMs like Hibernate, EclipseLink, etc to use optimistic concurrency control (often called "optimistic locking") to overcome the limitations of weaker isolation levels while preserving performance.
A key feature of this approach is that it lets you span work across multiple transactions, which is a big plus with systems that have high user counts and may have long delays between interactions with any given user.
In addition to the in-text links, see the PostgreSQL documentation chapter on locking, isolation and concurrency. Even if you're using a different RDBMS you'll learn a lot from the concepts it explains.
1I'm ignoring the rarely implemented READ UNCOMMITTED
isolation level here for simplicity; it permits dirty reads.
2As @meriton points out, the corollary isn't necessarily true. Phantom reads occur in anything below SERIALIZABLE
. One part of an in-progress transaction doesn't see some changes (by a not-yet-committed transaction), then the next part of the in-progress transaction does see the changes when the other transaction commits.
3 Well, IIRC SQLite2 does by virtue of locking the whole database when a write is attempted, but that's not what I'd call an ideal solution to concurrency issues.