SSIS need to add same rownumber to rows, then reset when it finds a certain value in a row, from file source

Question

I have a file source where the data is not in normalized form with any sort of primary key value or repeating group value. I am using Merge Join to put the multiple rows into one merged row. I need to apply some row numbering so that I have a join between the multiple rows, to get them into the one single row for the merge join.

Here is what the source data looks like:

Data Rows:
MSH|BLAH|||BLAHBLAH15|BLAHZ|||
EVN|MOREBLAH|BLAHBLAH11|BLAHY|||
PID|BLAHXX|BLAHBLAH655|BLAHX|||
PV1|BLAHX2|BLAHBLAH42|BLAHX|||||||||
DG1|1||84|XXXX||A
IN1|1||11400|TEST
IN1|2||20100|TEST2
MSH|BLAH2|BLAHBLAH5|BLAHZ|||
EVN|BLAH6|20220131123100
PID|BLAHGG|BLAH222|BLAHX|||
PV1|PV1|BLAHX2|BLAHBLAH42|BLAHX||||||||20220101|
DG1|1||84|XXXX||A
DG1|2||84|XXXX||A
IN1|1||11600|TEST2

What is consistent is that there is always an MSH line as the header, and everything below it belongs with the MSH line at the top.

So I'm trying to accomplish this by applying a row numbering as below, where it goes from 1,1,1,1 to 2,2,2,2,2 incrementing by one wherever it finds the MSH line, as per below:

Data Rows:                            Numbering Needed:
MSH|BLAH|||BLAHBLAH15|BLAHZ|||        1
EVN|MOREBLAH|BLAHBLAH11|BLAHY|||      1
PID|BLAHXX|BLAHBLAH655|BLAHX|||       1
PV1|BLAHX2|BLAHBLAH42|BLAHX|||||||||  1
DG1|1||84|XXXX||A                     1
IN1|1||11400|TEST                     1
IN1|2||20100|TEST2                    1
MSH|BLAH2|BLAHBLAH5|BLAHZ|||          2
EVN|BLAH6|20220131123100              2
PV1|PV1|BLAHX2|BLAHBLAH42|BLAHX|||||| 2
DG1|1||84|XXXX||A                     2
DG1|2||84|XXXX||A                     2
IN1|1||11600|TEST2                    2

I can't use a specific row count to reset the number, ie: Every 5 rows increment the row numbering, because it's an inconsistent number of rows each time. In the example above, the first set is 7 rows and the 2nd set is 6 rows. I have to do my incrementing by the presence of the "MSH" row value, and apply the same number on down until it finds the next "MSH". I know that have to use a script task (preferably in C#) to generate this row number since my source is a file. But I just can't seem to find the right logic that will do this, since my data doesn't have a repeating key for each row that I can partition by.

SSIS need to add same rownumber to rows, then reset when it finds a certain value in a row, from file source

Answers (1)

Related Questions