Is there a dbunit-like framework that doesn't suck for java/scala?

egervari · Oct 16, 2010 · Viewed 27.1k times

I was thinking of making a new, light-weight database population framework. I absolutely hate dbunit. Before I do, I want to know if someone already did it.

Things I dislike about dbunit:

1) The simplest format to write and get started is deprecated. They want you to use formats that are bloated. Some even require xml schemas. Yeah, whatever.

2) They populate rows not in the order you write them, but in the order tables are defined in the xml file. This is really bad because you can't order your data in such a way that foreign key constraints won't cause problems. This just forces you to go through the hassle of turning them off altogether.

This also wastes time and bloats up your junit base classes to include code to disable the foreign key constraints. You will probably have to test for the database type (hsqldb, etc.) and disable them in database-specific ways. This is way bad.

It could be better if dbunit helped in disabling foreign key constraints as part of their framework automatically, but they don't do this. They do keep track of dialects... so why not use them for this? Ultimately, all of this does is force the programmer to waste time and not get up and testing quickly.

3) XML is a pain to write. I don't need to say more about this. They also offer so many ways to do it, that I think it just complicates matters. Just offer one really solid way and be done with it.

4) When your data gets large, keeping track of the ids and their consistent/correct relationships is a royal pain.

Also, if you don't work on a project for a month, how are you to remember that user_id 1 was an admin, user_id 2 was a business user, user_id 3 was an engineer and user_id 4 was something else? Going back to check this is wasting more time. There should be a meaningful way to retrieve it other than an arbitrary number.

5) It's slow. I've found that unless hsqldb is used, it is painfully slow. It doesn't have to be. There are also numerous ways to mess up its configuration as it is not easy to do "out of the box". There is a hump that you must go through to get it working right. All this does is encourage people to not use it, or be pissed off when they do start to use it.

6) Some values tend to repeat a lot, like dates. It'd be nice to specify defaults, or even have the framework put defaults in automatically, even without you telling it to put defaults in there. That way you can create objects just with the values you want, and leave the rest off. This sure beats specifying every nook and cranny of a column if it's not required.

7) Probably the most annoying thing is that the first entry must include ALL the values - even null placeholders - or future rows won't pick the columns that you actually specified.

DBunit doesn't have a sensible default for translating [NULL] to a real null value either. You have to manually add it. Tell me, who hasn't done this with dbunit? Everyone has. It shouldn't be like this!

What this means is that if you have a polymorphic object, you must declare all the foreign keys to the joining tables of each subclass in the first row, even though they are null. If you do a table for all subclasses pattern, you still have to specify all the fields on the first row. This is just awful.

Anything out there to satisfy me, or should I become the next framework developer of a much better database testing framework?

Answer

Pascal Thivent · Oct 16, 2010

I'm not aware of any real alternative to DbUnit, and none of the tools mentioned by @Joe are real alternatives in my eyes:

  • Incanto: not DB agnostic
  • SQLUnit: a regression and unit testing harness for testing database stored procedures (that's not what DbUnit is about)
  • Cactus: a tool for In-container testing (I fail to see where it helps with databases)
  • Liquibase: a database migration tool (doesn't load/verify data)
  • ORMUnit: can initialize a database but that's all
  • JMock: doesn't compete with DbUnit at all

That being said, I've personally used DbUnit successfully several times, on small and huge projects, and I find it pretty usable, especially when using Unitils and its DbUnit module. This doesn't mean it's perfect and can't be improved but with decent tooling (either custom made or something like Unitils), using it has been a decent experience.

So let me answer some of your points:

  1. The simplest format to write and get started is deprecated. They want you to use formats that are bloated. Some even require xml schemas. Yeah, whatever.

DbUnit supports flat or structured XML, XLS, CSV. What revolutionary format would you like to use? By the way, a DTD or schema is not mandatory when using XML. But it gives you nice things like validation and auto-completion, how is that bad? And Unitils can generate it easily for you, see Generate an XSD or DTD of the database structure.

It could be better if dbunit helped in disabling foreign key constraints as part of their framework automatically, but they don't do this. They do keep track of dialects... so why not use them for this? Ultimately, all of this does is force the programmer to waste time and not get up and testing quickly.

They are waiting for your patch.

Meanwhile, Unitils provides support to handle constraints transparently, see Disabling constraints and updating sequences.
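For anyone not using Unitils, the database-specific switch described in the question could be sketched roughly as below. This is only an illustration, not DbUnit or Unitils code: `ForeignKeyToggle` and its methods are hypothetical names, only three databases are covered, and the HSQLDB statement assumes a 2.x release.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper: picks the statement that switches off foreign key checks. */
final class ForeignKeyToggle {

    private ForeignKeyToggle() {}

    /**
     * Maps the name reported by DatabaseMetaData.getDatabaseProductName()
     * to a vendor-specific statement. Only three databases are covered here.
     */
    static Optional<String> disableStatementFor(String productName) {
        String name = productName.toLowerCase(Locale.ROOT);
        if (name.contains("hsql")) {
            // Assumes HSQLDB 2.x; older 1.8 releases used a different statement.
            return Optional.of("SET DATABASE REFERENTIAL INTEGRITY FALSE");
        }
        if (name.equals("h2")) {
            return Optional.of("SET REFERENTIAL_INTEGRITY FALSE");
        }
        if (name.contains("mysql")) {
            // Affects the current session only.
            return Optional.of("SET FOREIGN_KEY_CHECKS = 0");
        }
        return Optional.empty(); // unknown database: leave constraints alone
    }

    /** Runs the statement on the given connection; returns false if none is known. */
    static boolean disable(Connection connection) throws SQLException {
        String product = connection.getMetaData().getDatabaseProductName();
        Optional<String> sql = disableStatementFor(product);
        if (!sql.isPresent()) {
            return false;
        }
        try (Statement statement = connection.createStatement()) {
            statement.execute(sql.get());
        }
        return true;
    }
}
```

A test base class would call `ForeignKeyToggle.disable(connection)` once before loading the dataset and fall back to ordering the inserts by hand when it returns `false`.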

  3. XML is a pain to write. I don't need to say more about this. They also offer so many ways to do it, that I think it just complicates matters. Just offer one really solid way and be done with it.

I guess pain is subjective but I don't find it painful, especially when using a schema and autocompletion. What is the silver bullet you're suggesting?

  4. When your data gets large, keeping track of the ids and their consistent/correct relationships is a royal pain.

Keep them small, that's a known best practice. You're going against a known best practice and then complaining...

Also, if you don't work on a project for a month, how are you to remember that user_id 1 was an admin, user_id 2 was a business user, user_id 3 was an engineer and user_id 4 was something else? Going back to check this is wasting more time. There should be a meaningful way to retrieve it other than an arbitrary number.

Yes, task switching is counter productive. But since you're working with low level data, you have to know how they are represented, there is no magic solution unless you use a higher level API of course (but that's not the purpose of DbUnit).
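One cheap way to avoid memorizing the numbers is to give the ids names in the test code itself. The sketch below is not part of DbUnit; `TestUser` is a hypothetical enum, and the ids are the ones from the question's example (1 is the admin, 2 the business user, 3 the engineer).

```java
/** Hypothetical names for the user ids used in a test dataset. */
enum TestUser {
    ADMIN(1L),
    BUSINESS_USER(2L),
    ENGINEER(3L);

    private final long id;

    TestUser(long id) {
        this.id = id;
    }

    /** The primary key this user has in the dataset file. */
    long id() {
        return id;
    }

    /** Reverse lookup, handy when a failing assertion only shows a number. */
    static TestUser fromId(long id) {
        for (TestUser user : values()) {
            if (user.id == id) {
                return user;
            }
        }
        throw new IllegalArgumentException("No test user with id " + id);
    }
}
```

Tests can then refer to `TestUser.ADMIN.id()` instead of a bare `1`. The dataset file still holds the raw numbers, so the enum has to be kept in sync with it by hand.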

  5. It's slow. I've found that unless hsqldb is used, it is painfully slow. It doesn't have to be. There are also numerous ways to mess up its configuration as it is not easy to do "out of the box". There is a hump that you must go through to get it working right. All this does is encourage people to not use it, or be pissed off when they do start to use it.

That's inherent to databases and JDBC, not DbUnit. Use a fast database like H2 if you want things to be as fast as possible (if you have a better agnostic way to do things, I'd be glad to learn about it).

  7. Probably the most annoying thing is that the first entry must include ALL the values - even null placeholders - or future rows won't pick the columns that you actually specified.

Not when using Unitils as mentioned in presentations like Unitils - Home - JavaPolis 2008 or Unit testing: unitils & dbmaintain.
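The complaint is that the column list is taken from the first row only. The alternative, reading all rows first and using the union of their columns, can be sketched in plain Java. This is not how DbUnit or Unitils is implemented; `ColumnUnion` and the map-per-row representation are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: derives a table's columns from all rows, not just the first. */
final class ColumnUnion {

    private ColumnUnion() {}

    /** Collects every column name that appears in any row, in first-seen order. */
    static Set<String> allColumns(List<Map<String, Object>> rows) {
        Set<String> columns = new LinkedHashSet<>();
        for (Map<String, Object> row : rows) {
            columns.addAll(row.keySet());
        }
        return columns;
    }

    /** Returns copies of the rows in which every missing column is an explicit null. */
    static List<Map<String, Object>> normalize(List<Map<String, Object>> rows) {
        Set<String> columns = allColumns(rows);
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> full = new LinkedHashMap<>();
            for (String column : columns) {
                full.put(column, row.get(column)); // null when the row omits the column
            }
            result.add(full);
        }
        return result;
    }
}
```

With this approach a first row that only sets `id` no longer hides a `manager_id` column that shows up in a later row, at the cost of reading the whole table before inserting anything.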

Anything out there to satisfy me, or should I become the next framework developer of a much better database testing framework?

If you think you can make things better, maybe contribute to existing solutions. If that's not possible and if you think you can create the killer database testing framework, what can I say, do it. But don't forget: ranting is easy, coming up with solutions of your own is less so.