Write JSON into a field using Talend Open Studio

Elorfin picture Elorfin · Dec 12, 2013 · Viewed 8.8k times · Source

I try to migrate data an old database into our new application.

In process, I need to grab data from old db to create a JSON which must be stored in a field in the new MySQL db.

So I use the components tWriteJSONField and tExtractJSONFields.

In tWriteJSONField, my XML tree looks like this :

path
|-- id [loop element]
|-- name
|-- description

N.B. : I can't find how to use loop element and group element properties. I don't understand how it works and the documentation doesn't talk about this.

The component tWriteJSONField is linked to a tExtractJSONFields in order to extract id from JSON. I need this to know to each record JSON must be linked.

tExtractJSONFields configuration : XPath request

"/path"

tExtractJSONFields configuration : Mapping

-----------------------------------------------
| column        | XPath request | get nodes ? |
-----------------------------------------------
| idForm        | "id"          | false       |
-----------------------------------------------
| jsonStructure | "*"           | yes         |
-----------------------------------------------

My problem is in jsonStructure output by tExtractJSONField, I only get the first child of my root tag. In my case jsonStructure looks like this :

{
   "id": "123"
}

Expected result is :

{
   "id": "123",
   "name": "Test",
   "description": "Test"
}

If I declare the child name before id for example, I will get :

{
   "name": "Test"
}

I have tried to change the XPath query for jsonStructure but I never get all the fields.

Why ?

It's my first question about Talend, so if it lacks information, let me know in the comments.

Thanks for help.

EDIT :

Data from tMysqlInput to tWriteJSONField :

N.B. : My flux contains more columns but I only show you which are used to create JSON.

---------------------------------------------------------------------------------------
| IdForm | NomForm                    | DescrForm                                     |
---------------------------------------------------------------------------------------
| 1      | English training           | <p>This is a description of the training</p>  |
---------------------------------------------------------------------------------------
| 2      | French training            | <p>This contains HTML tags from a WYSIWYG</p> |
---------------------------------------------------------------------------------------
| 3      | How to use the application | <p>Description</p>                            |
---------------------------------------------------------------------------------------

In tWriteJSONField, columns are mapped to the JSON like this :

path
|-- id [loop element] --> IdForm
|-- name              --> NomForm
|-- description       --> DescrForm

tWriteJSONField output a new flux with the same columns as the input (although, this columns are all empty in the output even if they were populated in input) and add a new one jsonStructure which contains generated JSON.

This new flux is caught by a tExtractJSONFields (configuration for this component is available in my original post).

tExtractJSONFields outputs this flux :

--------------------------
| IdForm | jsonStructure |
--------------------------
| 1      | { "id": "1" } |
--------------------------
| 2      | { "id": "2" } |
--------------------------
| 3      | { "id": "3" } |
--------------------------

And I expect it returns this one :

--------------------------------------------------------------------------------------------
| IdForm | jsonStructure                                                                   |
--------------------------------------------------------------------------------------------
| 1      | { "id": "1", "name": "English training", "description": "<p>This is[...]</p>" } |
--------------------------------------------------------------------------------------------
| 2      | { "id": "2", "name": "French training", "description": "<p>[...]</p>" }         |
--------------------------------------------------------------------------------------------
| 3      | { "id": "3", "name": "How to use the [...]", "description": "<p>[...]</p>" }    |
--------------------------------------------------------------------------------------------

EDIT 2

I use TOS 5.4.0.r110020 if it can help.

Answer

Julien Boulay picture Julien Boulay · Dec 13, 2013

Your XPath request for JSONStructure column is not correct. Just remove "*" and you will get the expected result.

XPathQuery

Also, if you don't need the root node in the json entry, just check "Remove root node" on tWriteJsonField and change Loop XPath Query to "/" in tExtractJSONFields

Remove Root Node on tExtractJSONFields