How do two-phase commits prevent last-second failure?

Gili · Oct 5, 2008

I am studying how two-phase commit works across a distributed transaction. It is my understanding that the transaction coordinator first asks each node whether it is ready to commit. If every node agrees, it then tells them to go ahead and commit.

What prevents the following failure?

  1. All nodes respond that they are ready to commit
  2. The transaction coordinator tells them to "go ahead and commit" but one of the nodes crashes before receiving this message
  3. All other nodes commit successfully, but now the distributed transaction is corrupt
  4. It is my understanding that when the crashed node comes back, its transaction will have been rolled back (since it never got the commit message)

I am assuming each node is running a normal database that doesn't know anything about distributed transactions. What did I miss?

Answer

Jason Kresowaty · Oct 5, 2008

No, the nodes are not instructed to roll back, because in the original poster's scenario some of them have already committed. What happens instead is that when the crashed node becomes available again, the transaction coordinator sends it the commit instruction again.

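Because the node responded positively in the "prepare" phase, it is required to be able to "commit", even when it comes back from a crash. In practice this means the node must durably record the prepared transaction before it votes yes, so that a restart leaves the transaction pending rather than rolled back. A database that knows nothing about distributed transactions cannot make that promise, so it cannot take part as-is.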
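To make the recovery path concrete, here is a minimal in-memory sketch of the scenario from the question. Everything in it is an assumption made for illustration: the `Participant` and `Coordinator` classes, the `durable_log` dict standing in for on-disk storage, and the `simulate_crash_before_commit` helper are hypothetical names, not the API of any real database or transaction manager. It covers only the commit path; abort handling and timeouts are left out.

```python
from enum import Enum


class State(Enum):
    PREPARED = "prepared"
    COMMITTED = "committed"


class Participant:
    """Hypothetical node; `durable_log` simulates storage that survives a crash."""

    def __init__(self, name):
        self.name = name
        self.durable_log = {}
        self.up = True

    def prepare(self, txid):
        # Record PREPARED durably *before* voting yes. This is the promise
        # that makes a later commit possible even after a crash.
        self.durable_log[txid] = State.PREPARED
        return True

    def commit(self, txid):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        # Idempotent: receiving the commit message twice is harmless.
        self.durable_log[txid] = State.COMMITTED

    def crash(self):
        self.up = False

    def recover(self):
        # On restart, prepared transactions are NOT rolled back. They stay
        # in doubt until the coordinator repeats its decision.
        self.up = True
        return [t for t, s in self.durable_log.items() if s is State.PREPARED]


class Coordinator:
    def __init__(self, participants):
        self.participants = participants
        self.decisions = {}  # simulated durable decision log
        self.unacked = {}    # txid -> participants that have not committed yet

    def prepare_all(self, txid):
        # Phase 1: collect votes, then log the decision before phase 2.
        if all(p.prepare(txid) for p in self.participants):
            self.decisions[txid] = State.COMMITTED
            self.unacked[txid] = list(self.participants)

    def deliver_commit(self, txid):
        # Phase 2: send commit to everyone still outstanding and remember
        # the ones that could not be reached, so the call can be repeated.
        still_waiting = []
        for p in self.unacked[txid]:
            try:
                p.commit(txid)
            except ConnectionError:
                still_waiting.append(p)
        self.unacked[txid] = still_waiting


def simulate_crash_before_commit():
    a, b = Participant("a"), Participant("b")
    coord = Coordinator([a, b])
    coord.prepare_all("tx1")
    b.crash()                    # b dies after voting yes
    coord.deliver_commit("tx1")  # a commits, b is unreachable
    after_first_try = (a.durable_log["tx1"], b.durable_log["tx1"])
    in_doubt = b.recover()       # b restarts with tx1 still prepared
    coord.deliver_commit("tx1")  # coordinator repeats the commit
    return after_first_try, in_doubt, b.durable_log["tx1"]


if __name__ == "__main__":
    print(simulate_crash_before_commit())
```

Calling `simulate_crash_before_commit()` walks through steps 1 to 3 of the question: after the first delivery attempt one node is committed and the crashed one is still prepared, and after recovery plus a repeated commit both agree. The point of the design is that the prepared record, not the commit message, is what has to survive the crash.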