Reputation: 111
I have a file source where the data looks like this:
ID BarcodeNumber
------------------------
1 123456789
1 33
2 987654321
2 44
I would like to get the data as follows:
ID BarcodeNumber
------------------------
1 12345678933
2 98765432144
Any suggestions on how to accomplish this?
EDIT: I would like to do this in the data flow as this could potentially have billions of rows.
Upvotes: 4
Views: 10796
Reputation: 1505
Source File:
ID,RowNum,Barcode
1,1,123456789
1,2,33
2,1,987654321
2,2,44
Split the rows of the file. For the simple case as assumed in the source file here, we can use the conditional split. for more involved situation, we can use Script component as the splitter. From here on, the solution would be pretty standard wrt sort and merge. Once merged, add a derived column: TRIM(Barcode1) + TRIM(Barcode2) where Barcode1 is coming from the "left side" which has rows with RowNum 1, and Barcode2 from the "right side". If you are concerned about the performance of sort component, put the split rows into two tables, and compare the performance.
Cond. Split: RowNum1 (RowNum == "1"), RowNum2 (RowNum == "2")
Sort: By ID; Pass through: Barcode
Merge Join: Left outer join, Join key: ID; Output: ID, Barcode1 and Barcode2
Derived column: TRIM(Barcode1) + TRIM(Barcode2)
Upvotes: 4
Reputation: 61201
The desired state is that for each ID, all of the barcodes will be concatenated together.
To accomplish this, your data must be sorted by ID. If the file isn't sorted already, then you'll need to run it through the Sort task.
Once you have sorted data, then you will need to add a Script Transformation in Asynchronous mode. It'll be async since the number of rows in will not be the same number as rows out.
The psuedo code will be something like
If Row.ID <> LastID
If LastID <> ""
Output0Buffer.ID = LastID
Output0Buffer.Barcodes = Barcodes
LastID = Row.ID
Barcodes = ""
Barcodes = Barcodes + Row.Barcodes
Upvotes: 6