jc carmelo
jc carmelo

Reputation: 133

Transpose data using Talend

I have this kind of data:

enter image description here

I need to transpose this data into something like this using Talend:

enter image description here

Help would be much appreciated.

Upvotes: 1

Views: 6931

Answers (2)

mhassine
mhassine

Reputation: 191

dbh's suggestion should work indeed, but I did not try it.

However, I have another solution which doesn't require to change input format and is not too complicated to implement. Indeed the job has only 2 transformation components (tDenormalize and tMap).

The job looks like the following: enter image description here

Explanation :

  • Your input is read from a CSV file (could be a database or any other kind of input)
  • tDenormalize component will Denormalize your column value (column 2), based on value on id column (column 1), separating fields with a specific delimiter (";" in my case), resulting as shown in 2 rows.
  • tMap : split the aggregated column into multiple columns, by using java's String.split() method and spreading the resulting array into multiple columns. The tMap should like like this: enter image description here

Since Talend doesn't accept to store Array objects, make sure to store the splitted String in Object format. Then, cast that object into Array on the right side of the Map.

That approach should give you the expected result.

IMPORTANT:

  • tNormalize might shuffle the rows, meaning for bigger input, you might encounter unsorted output. Make sure to sort it if needed or use tDenormalizeSortedRow instead.
  • tNormalize is similar to an aggregation component meaning it scans the whole input before processing, which results into possible performance issues with particularly big inputs (tens of millions of records).
  • Your input is probably wrong (you have 5 entries with 1 as id, and 6 entries with 2 as id). 6 columns are expected meaning you should always have 6 lines per id. If not, then you should implement dbh's solution, and you probably HAVE TO add a column with a key.

Upvotes: 2

dbh
dbh

Reputation: 1627

You can use Talend's tPivotToColumnsDelimited component to achieve this. You will most likely need an additional column in your data to represent the field name.

Like "Identifier, field name, value "

Then you can use this component to pivot the data and write a file as output. If you need to process the data further, read the resulting file with tFileInoutDelimited .

See docs and an example at https://help.talend.com/display/TalendOpenStudioComponentsReferenceGuide521EN/13.43+tPivotToColumnsDelimited

Upvotes: 1

Related Questions