abierto
abierto

Reputation: 1467

Talend - generating n multiple rows from 1 row

Background: I'm using Talend to do something (I guess) that is pretty common: generating multiple rows from one. For example:

ID | Name | DateFrom | DateTo

01 | Marco| 01/01/2014 | 04/01/2014

...could be split into:

new_ID | ID | Name | DateFrom | DateTo

01 | 01 | Marco | 01/01/2014 | 02/01/2014

02 | 01 | Marco | 02/01/2014 | 03/01/2014

03 | 01 | Marco | 03/01/2014 | 04/01/2014

The number of outcoming rows is dynamic, depending on the date period in the original row.

Question: how can I do this? Maybe using tSplitRow? I am going to check those periods with tJavaRow. Any suggestions?

Upvotes: 5

Views: 11548

Answers (3)

psychoslave
psychoslave

Reputation: 3041

I came here as I wanted to add all context parameters into an Excel data sheet. So the solution bellow works when you are taking 0 input lines, but can be adapted to generate several lines for each line in input.

The design is actually straight forward:

tJava –trigger-on-OK→ tFileInputDelimited  → tDoSomethingOnRowSet
  ↓                          ↑
[write into a CSV]     [read the CSV]

And here is the kind of code structure usable in the tJava.

try {
  StringBuffer wad = new  StringBuffer();
  
  wad.append("Key;Nub"); // Header
  context.stringPropertyNames().forEach(
    key -> wad.
      append(System.getProperty("line.separator")).
      append(key + ";" + context.getProperty(key) ) 
  );
  
  // Here context.metadata contains the path to the CSV file
  FileWriter output = new FileWriter(context.metadata);
  output.write(wad.toString());
  output.close();
} catch (IOException mess) {
  System.out.println("An error occurred.");
  mess.printStackTrace();
}

Of course if you have a set of rows as input, you can adapt the process to use a tJavaRow instead of a tJava.

You might prefer to use an Excel file as an on disk buffer, but dealing with this file format asks more work at least the first time when you don’t have the Java libraries already configured in Talend. Apache POI might help you if you nonetheless chose to go this way.

Upvotes: 0

Garlando
Garlando

Reputation: 196

Expanding on the answer given by Balazs Gunics

Your first part is to calculate the number of rows one row will become, easy enough with a date diff function on the to and from dates

overall view

calculate number of rows per row


Part 2 is to pass that value to a tFlowToIterate, and pick it up with a tJavaFlex that will use it in its start code to control a for loop:

tJavaFlex start:

int currentId = (Integer)globalMap.get("out1.id");
String currentName = (String)globalMap.get("out1.name");
Long iterations = (Long)globalMap.get("out1.iterations");
Date dateFrom = (java.util.Date)globalMap.get("out1.dateFrom");

for(int i=0; i<((Long)globalMap.get("out1.iterations")); i++) { 

Main

  row2.id = currentId;
  row2.name = currentName;
  row2.dateFrom = TalendDate.addDate(dateFrom, i, "dd");
  row2.dateTo = TalendDate.addDate(dateFrom, i+1, "dd");

End

}

and sample output:

1|Marco|01-01-2014|02-01-2014
1|Marco|02-01-2014|03-01-2014
1|Marco|03-01-2014|04-01-2014
2|Polo|01-01-2014|02-01-2014
2|Polo|02-01-2014|03-01-2014
2|Polo|03-01-2014|04-01-2014
2|Polo|04-01-2014|05-01-2014
2|Polo|05-01-2014|06-01-2014
2|Polo|06-01-2014|07-01-2014
2|Polo|07-01-2014|08-01-2014
2|Polo|08-01-2014|09-01-2014
2|Polo|09-01-2014|10-01-2014
2|Polo|10-01-2014|11-01-2014
2|Polo|11-01-2014|12-01-2014
2|Polo|12-01-2014|13-01-2014
2|Polo|13-01-2014|14-01-2014
2|Polo|14-01-2014|15-01-2014
2|Polo|15-01-2014|16-01-2014
2|Polo|16-01-2014|17-01-2014
2|Polo|17-01-2014|18-01-2014
2|Polo|18-01-2014|19-01-2014
2|Polo|19-01-2014|20-01-2014
2|Polo|20-01-2014|21-01-2014
2|Polo|21-01-2014|22-01-2014
2|Polo|22-01-2014|23-01-2014
2|Polo|23-01-2014|24-01-2014
2|Polo|24-01-2014|25-01-2014
2|Polo|25-01-2014|26-01-2014
2|Polo|26-01-2014|27-01-2014
2|Polo|27-01-2014|28-01-2014
2|Polo|28-01-2014|29-01-2014
2|Polo|29-01-2014|30-01-2014
2|Polo|30-01-2014|31-01-2014
2|Polo|31-01-2014|01-02-2014

Upvotes: 14

Balazs Gunics
Balazs Gunics

Reputation: 2077

You can use tJavaFlex to do this.

If you have a small amount of columns the a tFlowToIterate -> tJavaFlex options could be fine.

In the begin part you can start to iterate, and in the main part you assign values to the output schema. If you name your output is row6 then:

row6.id = (String)globalMap.get("id");

and so on.

Upvotes: 0

Related Questions