What is the 'best' way to do distributed transactions across multiple databases using Spring and Hibernate

Vihung picture Vihung · Sep 24, 2008 · Viewed 38.7k times · Source

I have an application - more like a utility - that sits in a corner and updates two different databases periodically.

It is a little standalone app that has been built with a Spring Application Context. The context has two Hibernate Session Factories configured in it, in turn using Commons DBCP data sources configured in Spring.

Currently there is no transaction management, but I would like to add some. The update to one database depends on a successful update to the other.

The app does not sit in a Java EE container - it is bootstrapped by a static launcher class called from a shell script. The launcher class instantiates the Application Context and then invokes a method on one of its beans.

What is the 'best' way to put transactionality around the database updates?

I will leave the definition of 'best' to you, but I think it should be some function of 'easy to set up', 'easy to configure', 'inexpensive', and 'easy to package and redistribute'. Naturally FOSS would be good.

Answer

Aaron Digulla picture Aaron Digulla · May 20, 2009

The best way to distribute transactions over more than one database is: Don't.

Some people will point you to XA but XA (or Two Phase Commit) is a lie (or marketese).

Imagine: After the first phase have told the XA manager that it can send the final commit, the network connection to one of the databases fails. Now what? Timeout? That would leave the other database corrupt. Rollback? Two problems: You can't roll back a commit and how do you know what happened to the second database? Maybe the network connection failed after it successfully committed the data and only the "success" message was lost?

The best way is to copy the data in a single place. Use a scheme which allows you to abort the copy and continue it at any time (for example, ignore data which you already have or order the select by ID and request only records > MAX(ID) of your copy). Protect this with a transaction. This is not a problem since you're only reading data from the source, so when the transaction fails for any reason, you can ignore the source database. Therefore, this is a plain old single source transaction.

After you have copied the data, process it locally.