Strategy to load a set of files in Talend

PabloCocko picture PabloCocko · Jun 9, 2011 · Viewed 8.2k times · Source

I want to know which is best strategy to aboard the following problem in Talend:

  • I need to load data from a set of delimited files that are stored in a directory with names like (SAMPLE1.DAT, SAMPLE2.DAT, ... , SAMPLEX.DAT)
  • The target will be a table in a MySQL database
  • I have to load all data at once because after this task I need to work with all records in the same table

I'm a bit confused because I don't know if it possible in Talend. I was seeing the tFileInputDelimited component but I didn't find the way to solve it.

Thanks

Answer

drmirror picture drmirror · Aug 26, 2011

To read several files from one directory, you would use the tFileList component. It allows you to specify a directory and a file name pattern. All files in the directory matching the pattern will be processed, one after the other.

You need to use an "Iterate" link from the tFileList component to those components that describe what you want to do with each file. In your case, you would start with a tFileInputDelimited component (read the file) and connect the main output of that to a tMysqlOutput component. The MySQL component will, by default, just append the data to an existing table, so that should get you the result you want.

In the tFileInputDelimited component, you would not use a fixed filename, but a variable filename which is set by the tFileList component for each iteration (your loop variable, so to speak of). The name of that loop variable can be seen in the "outline" view in the studio, usually in the bottom left corner.