Inserting lots of values (with FK's) into database using LiquiBase and Spring

Gonçalo Cardoso · Aug 27, 2012 · Viewed 22.5k times

I'm trying to add a lot of records (currently located in an Excel file) into my DB using Liquibase, so that I know how to do it for future DB changes.

My idea was to read the Excel file using Java and then fill the ChangeLogParameters from my Spring initialization class, like this:

SpringLiquibase liqui = new SpringLiquibase();
liqui.setBeanName("liquibaseBean");
liqui.setDataSource(dataSource());
liqui.setChangeLog("classpath:changelog.xml");

HashMap<String, String> values = new HashMap<String, String>();
values.put("line1col1", ExcelValue1);
values.put("line1col2", ExcelValue2);
values.put("line1col3", ExcelValue3);
values.put("line2col1", ExcelValue4);
values.put("line2col2", ExcelValue5);
values.put("line2col3", ExcelValue6);
...
liqui.setChangeLogParameters(values);

The problem with this approach is that my changelog.xml would be very awkward (and unproductive):

<changeSet author="gcardoso" id="2012082707">
    <insert tableName="t_user">
        <column name="login" value="${line1col1}"/>
        <column name="name" value="${line1col2}"/>
        <column name="password" value="${line1col3}"/>
    </insert>
    <insert tableName="t_user">
        <column name="login" value="${line2col1}"/>
        <column name="name" value="${line2col2}"/>
        <column name="password" value="${line2col3}"/>
    </insert>
    ...
</changeSet>

Is there any way that I could do something like this:

HashMap<String, ArrayList<String>> values = new HashMap<String, ArrayList<String>>();
values.put("col1", Column1);
values.put("col2", Column2);
values.put("col3", Column3);
liqui.setChangeLogParameters(values);

<changeSet author="gcardoso" id="2012082707">
    <insert tableName="t_user">
        <column name="login" value="${Column1}"/>
        <column name="name" value="${Column2}"/>
        <column name="password" value="${Column3}"/>
    </insert>
</changeSet>

Or is there any other way?

EDIT: My current option is to convert the Excel file into a CSV file and import the data using:

<changeSet author="gcardoso" id="InitialImport2" runOnChange="true">

    <loadData tableName="T_ENTITY" file="com/exictos/dbUpdate/entity.csv">
        <column header="SHORTNAME" name="SHORTNAME" />
        <column header="DESCRIPTION" name="DESCRIPTION" />
    </loadData>


    <loadData tableName="T_CLIENT" file="com/exictos/dbUpdate/client.csv">
        <column header="fdbhdf" name="ENTITYID" defaultValueComputed="(SELECT ID FROM T_ENTITY WHERE SHORTNAME = ENTITY_REFERENCE)"/>
        <column header="DESCRIPTION" name="DESCRIPTION" />
    </loadData>


</changeSet>

with these CSV files:

entity.csv

SHORTNAME,DESCRIPTION
nome1,descricao1
nome2,descricao2

client.csv

DESCRIPTION,ENTITY_REFERENCE
descricaoCliente1,nome1
descricaoCliente2,nome2

But I get this error:

liquibase.exception.DatabaseException: Error executing SQL INSERT INTO `T_CLIENT` (`DESCRIPTION`, `ENTITY_REFERENCE`) VALUES ('descricaoCliente1', 'nome1'): Unknown column 'ENTITY_REFERENCE' in 'field list'

If I change the header of my client.csv to DESCRIPTION,ENTITYID I get this error:

liquibase.exception.DatabaseException: Error executing SQL INSERT INTO `T_CLIENT` (`DESCRIPTION`, `ENTITYID`) VALUES ('descricaoCliente1', 'nome1'): Incorrect integer value: 'nome1' for column 'entityid' at row 1

In any of these cases, it looks like defaultValueComputed doesn't work the same way as valueComputed does in the following example:

<changeSet author="gcardoso" id="InitialImport1">

    <insert tableName="T_ENTITY">
        <column name="SHORTNAME">nome1</column>
        <column name="DESCRIPTION">descricao1</column>
    </insert>

    <insert tableName="T_CLIENT">
        <column name="ENTITYID" valueComputed="(SELECT ID FROM T_ENTITY WHERE SHORTNAME = 'nome1')"/>
        <column name="DESCRIPTION">descricaoCliente</column>
    </insert>

</changeSet>

Is this the expected behavior? A bug in Liquibase? Or just me doing something wrong (the most likely)?

Or is there any other way to import a massive amount of data, still using Liquibase and/or Spring?

EDIT 2: My problem is that I'm unable to insert the data into the second table with the correct foreign key.

Answer

Barry Pitman · Aug 31, 2012

I would say that Liquibase is not the ideal tool for what you want to achieve. Liquibase is well-suited to managing the database structure, not the database's data.

If you still want to use Liquibase to manage the data, you have a couple of options:

  1. Record your insert statements as SQL, and refer to them from changelog.xml like this:

    <sqlFile path="/path/to/file.sql"/>

  2. Use a Custom Refactoring Class which you refer to from the changelog.xml like this:

    <customChange class="com.example.YourJavaClass" csvFile="/path/to/file.csv"/>

    YourJavaClass would read the records from the CSV file, and apply them to the database, implementing this method:

    void execute(Database database) throws CustomChangeException;
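As a rough sketch of what the core of such a class might do (table and column names are taken from the question, the CSV layout is assumed to match client.csv above, and a real CustomChange would obtain a JDBC connection from the Database object and use a PreparedStatement rather than string concatenation), each CSV row can be turned into an INSERT whose foreign key is resolved by a subselect, the same trick as the valueComputed example in the question. This also addresses the EDIT 2 problem of inserting into the second table with the correct foreign key:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClientInsertGenerator {

    /**
     * Builds one INSERT per CSV data row, resolving the ENTITYID foreign key
     * with a subselect on T_ENTITY (the valueComputed trick). The first line
     * is assumed to be the header: DESCRIPTION,ENTITY_REFERENCE.
     *
     * NOTE: string concatenation is used here only for illustration; a real
     * implementation must use a PreparedStatement to avoid SQL injection.
     */
    static List<String> buildInserts(List<String> csvLines) {
        List<String> statements = new ArrayList<>();
        for (String line : csvLines.subList(1, csvLines.size())) {
            String[] cols = line.split(",");
            String description = cols[0].trim();
            String entityReference = cols[1].trim();
            statements.add("INSERT INTO T_CLIENT (ENTITYID, DESCRIPTION) VALUES ("
                    + "(SELECT ID FROM T_ENTITY WHERE SHORTNAME = '" + entityReference + "'), "
                    + "'" + description + "')");
        }
        return statements;
    }

    public static void main(String[] args) {
        List<String> csv = Arrays.asList(
                "DESCRIPTION,ENTITY_REFERENCE",
                "descricaoCliente1,nome1",
                "descricaoCliente2,nome2");
        for (String sql : buildInserts(csv)) {
            System.out.println(sql);
        }
    }
}
```

Inside the `execute(Database database)` method you would run each generated statement against the connection Liquibase hands you, so the whole import stays inside one tracked changeset.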

Bear in mind that once you have loaded this data via Liquibase, you shouldn't modify the data in the file, because those changes won't be re-applied. If you want to make changes, you would have to do so in subsequent changesets. So after a while you might end up with a lot of different CSV files/Liquibase changesets, all operating on the same or similar data (this depends on how you are going to use this data - will it ever change once inserted?).

I would recommend looking at DBUnit for managing your reference data. It's a tool primarily used in unit testing, but it is very mature and, I would say, suitable for use in production. You can store the information in CSV or XML. I would suggest using a Spring InitializingBean to load the dataset from the classpath and perform a DBUnit 'refresh' operation, which will, from the docs:

This operation literally refreshes dataset contents into the database. This means that data of existing rows is updated and non-existing rows get inserted. Any rows which exist in the database but not in the dataset stay unaffected.
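To make that refresh behaviour concrete, here is a tiny stdlib-only model (plain Java, not the DBUnit API) of a table keyed by primary key: rows present in the dataset are updated or inserted, while rows that exist only in the database are left alone:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RefreshDemo {

    /**
     * Models a REFRESH on a table keyed by primary key: dataset rows
     * overwrite matching database rows or are inserted if absent;
     * database rows not in the dataset stay unaffected.
     */
    static Map<String, String> refresh(Map<String, String> table, Map<String, String> dataset) {
        Map<String, String> result = new LinkedHashMap<>(table);
        result.putAll(dataset); // update existing keys, insert new ones
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> db = new LinkedHashMap<>();
        db.put("nome1", "old description");   // will be updated
        db.put("nome3", "untouched row");     // not in dataset: unaffected

        Map<String, String> dataset = new LinkedHashMap<>();
        dataset.put("nome1", "descricao1");   // existing row: updated
        dataset.put("nome2", "descricao2");   // new row: inserted

        System.out.println(refresh(db, dataset));
    }
}
```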

This way, you can keep your reference data in one place and add to it over time, so that there is only one source of the information and it isn't split across multiple Liquibase changesets. Keeping your DBUnit datasets in version control provides traceability, and as a bonus, DBUnit datasets are portable across databases and can manage things like insert order to prevent foreign key violations for you.