How to select distinct values in a column in Talend

devaki · Nov 17, 2013 · Viewed 12.4k times

I am importing an Excel file in Talend. I want to select all the distinct values in column "A" and then dump that data into the database. Is it possible to do that with Talend? If not, what alternatives are available? Any help is appreciated.

Answer

Julien Boulay · Nov 17, 2013

Yes, you can do that easily with Talend Open Studio.

Create a new job like this one:

[Image: job design with an Excel input component linked to tAggregateRow, then to tOracleOutput]
You can replace the tOracleOutput component with the output component for your database.
Then configure the tAggregateRow component like this:

[Image: tAggregateRow settings with ColumnA as the group-by input column and distinctColumnA as the output column]

Distinct values of ColumnA will be transferred to distinctColumnA in the output schema.
You can also get the number of occurrences by adding a count operation on columnB in the operations table.
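
If it helps to see what grouping on ColumnA amounts to, here is a minimal Java sketch of the same idea: keep each distinct value once and count how often it occurs. This is not the code Talend generates. The class name `DistinctSketch`, the method `distinctWithCounts`, and the sample values are all hypothetical and only serve as illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DistinctSketch {

    // Groups the values of one column and counts how often each occurs,
    // keeping the order in which values were first seen
    // (comparable to a GROUP BY with a COUNT).
    public static Map<String, Integer> distinctWithCounts(List<String> columnA) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String value : columnA) {
            // Insert 1 for a new value, otherwise add 1 to the existing count.
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows standing in for ColumnA of the Excel sheet.
        List<String> columnA = List.of("red", "blue", "red", "green", "blue", "red");
        Map<String, Integer> result = distinctWithCounts(columnA);
        // The keys play the role of distinctColumnA; the values are the counts.
        System.out.println(result.keySet());
        System.out.println(result);
    }
}
```

The keys of the returned map correspond to what ends up in distinctColumnA, and the values correspond to the optional count operation.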