Reputation: 11
I have a comma-separated source flat file. The first line of the header holds a start date and end date, the next lines hold the field names (spread across three header rows), and the data follows. Example of the source file:
"2015-05-09","2015-06-05"
"CARRIER ","CONTRACT ","ELGBL/GRP "," ","TOTAL FEES","TOTAL FEES ","PMPM ACA ","PMPM ACA ","RETAIL ","RETAIL ","MEDICAID ","MEDICAID ","STATE ","STATE ","MAIL SERVICE","MAIL SERVICE","RETAIL ","RETAIL ","POSTAGE - ","POSTAGE - ","SYSTEM ","SYSTEM ","SYSTEM ","SYSTEM ","SYSTEM ","SYSTEM ","SYSTEM ","SYSTEM ","SYSTEM ","SYSTEM ","SYSTEM ","SYSTEM ","SYSTEM ","SYSTEM ","SYSTEM ","SYSTEM ","OTHER ","OTHER "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," ","
" "," "," "," ","COUNT ","AMOUNT ","METAL ","METAL ","DIRECTS ","DIRECTS "," "," ","HEALTH ","HEALTH ","CLAIMS ","CLAIMS ","PHARMACY ","PHARMACY ","BULK ","BULK ","MCRCF ","MCRCF ","MCRCP ","MCRCP ","MPA ","MPA ","MRXC ","MRXC ","PPACA ","PPACA ","QPC ","QPC ","RXSEL ","RXSEL ","SPCR ","SPCR ","COUNT ","AMOUNT "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," ","
" "," "," "," "," "," ","COUNT ","AMOUNT ","COUNT ","AMOUNT ","COUNT ","AMOUNT ","COUNT ","AMOUNT ","COUNT ","AMOUNT ","COUNT ","AMOUNT ","COUNT ","AMOUNT ","COUNT ","AMOUNT ","COUNT ","AMOUNT ","COUNT ","AMOUNT ","COUNT ","AMOUNT ","COUNT ","AMOUNT ","COUNT ","AMOUNT ","COUNT ","AMOUNT ","COUNT ","AMOUNT "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," "," ","
"1234 ","ABCD5678 ","ABCOEFITEST1 "," "," 1"," 1.60"," 1"," 1.60"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00","
"1234 ","ABCD5678 ","ABC200074002 "," "," 6"," 3.20"," 2"," 3.20"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 4"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00","
"1234 ","ABCD5678 ","ABC200178002 "," "," 2"," 1.60"," 1"," 1.60"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 1"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00","
"1234 ","ABCD5678 ","ABC200197001 "," "," 1"," 1.60"," 1"," 1.60"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00","
"1234 ","ABCD5678 ","ABC200204002 "," "," 2"," 3.20"," 2"," 3.20"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00","
"1234 ","ABCD5678 ","ABC200204003 "," "," 2"," 3.20"," 2"," 3.20"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00","
"1234 ","ABCD5678 ","ABC200232002 "," "," 4"," 1.60"," 1"," 1.60"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 3"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00","
"1234 ","ABCD5678 ","ABC200300001 "," "," 7"," 1.60"," 1"," 1.60"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 6"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00","
"1234 ","ABCD5678 ","ABC200345002 "," "," 1"," 1.60"," 1"," 1.60"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00","
"1234 ","ABCD5678 ","ABC200414001 "," "," 9"," 3.20"," 2"," 3.20"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 7"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00","
"1234 ","ABCD5678 ","ABC200491003 "," "," 1"," 1.60"," 1"," 1.60"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00"," 0"," 0.00","
So now I am reading each line as one long string port with precision 1000, and then I want to use a Java transformation to split it, stitch together the column names that are spread across the multiple header rows, and associate them with the values underneath. One input row will produce many target rows.
First: read the file line by line into a long string port.
Add a "CurrentlyProcessedFileName" port to the source to capture the filename.
Count the number of rows read.
Split the row on commas -- how do I implement this? I'm guessing an Expression or Java transformation, with the whole process below in Java, or can I use a stored procedure? But how do I split the row on commas? (My Java guess is sketched after the pseudocode below.)
If CurrentRowNumber = 1 Then
    vINVC_BGN_DT := field1
    vINVC_END_DT := field2
If CurrentRowNumber > 1 Then
    If field1 != $$CARRIER Then
        NAME1 := NAME1 || field1  #initialize the NAME variables to an empty string
        NAME2 := NAME2 || field2
        NAME3 := NAME3 || field3  #repeat for each comma-separated field
    If field1 = $$CARRIER Then
        VALUE1 := field1
        VALUE2 := field2
        VALUE3 := field3  #repeat for each comma-separated field
For each source row, the number of output rows should equal the count of distinct amount types (excluding empty ones):
For i in NAME4..NAMEx loop
    If NAMEi is not spaces or null then generate a row:
        target.INVC_TYPE := "CLAIM"
        target.FILE_RECEIPT_SK
        target.CARRIER := VALUE1
        target.CNTRCT := VALUE2
        target.PBM_GROUP := VALUE3
        target.INVC_BGN_DT := vINVC_BGN_DT  #from row 1
        target.INVC_END_DT := vINVC_END_DT  #from row 1
        target.INVC_AMOUNT_TYPE := NAMEi
        target.INVC_AMOUNT := VALUEi
End loop
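In Java transformation terms, I imagine the comma split itself would look something like this (just my guess; ROW is what I would call the long string input port):

String[] fields = ROW.split(",");
// each field keeps its surrounding quotes and padding, so clean them up
for (int i = 0; i < fields.length; i++)
    fields[i] = fields[i].replace("\"", "").trim();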
Upvotes: 0
Views: 997
Reputation: 3455
You can use a Normalizer to split one row into multiple rows. In your case, you have to separately process the first row, the next 3 header rows and the data rows. For that, create a sequence number in an Expression transformation using a variable port.
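For example (port names are just for illustration): a variable port v_seq with the expression v_seq + 1, and an output port seq returning v_seq, will number the incoming rows.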
You can get the begin and end date in that same Expression. Ex:
begin_date:= iif(seq=1, field1, begin_date)
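The end date can be captured the same way, e.g. end_date := iif(seq=1, field2, end_date).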
For separating the header rows from the data rows, use a Router with conditions like seq > 1 AND seq <= 4 (header rows) and seq > 4 (data rows).
Now you can use a Normalizer to split the row into multiple rows. For the data rows, create 5 columns (carrier, contract, pbm_group, count and amount) and set the occurs value of count and amount to the total number of count/amount column pairs.
You can use another Normalizer for the header columns in the same way. Finally, join the two flows using a Joiner transformation.
You can also use a Java transformation; on the On Input Row tab you have to write similar logic. For splitting the row you can use the split method. Ex:
String[] fields = row.split(",");
Now the array fields would contain all the fields as array elements. You can assign the appropriate elements to output ports and call Informatica's generateRow() method to create a target record.
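To illustrate, here is a minimal sketch of that On Input Row logic. It is only a sketch, under assumptions: ROW is an assumed input port holding the whole line; rowNum, beginDate, endDate and names are instance variables that would be declared on the Helper Code tab; the output port names are taken from the pseudocode in the question.

// Helper Code tab: state that must survive across input rows (assumed declarations)
private int rowNum = 0;
private String beginDate, endDate;
private java.util.List<String> names = new java.util.ArrayList<String>();

// On Input Row tab
rowNum++;
String[] f = ROW.split(",");
for (int i = 0; i < f.length; i++)
    f[i] = f[i].replace("\"", "").trim();      // strip quotes and padding

if (rowNum == 1) {                             // date row
    beginDate = f[0];
    endDate = f[1];
} else if (rowNum <= 4) {                      // 3 header rows: stitch the column-name fragments together
    for (int i = 0; i < f.length; i++) {
        if (names.size() <= i) names.add("");
        names.set(i, (names.get(i) + " " + f[i]).trim());
    }
} else {                                       // data row: one target row per non-empty amount-type column
    for (int i = 3; i < f.length && i < names.size(); i++) {
        if (names.get(i).length() == 0 || f[i].length() == 0) continue;
        INVC_TYPE = "CLAIM";
        CARRIER = f[0];
        CNTRCT = f[1];
        PBM_GROUP = f[2];
        INVC_BGN_DT = beginDate;
        INVC_END_DT = endDate;
        INVC_AMOUNT_TYPE = names.get(i);
        INVC_AMOUNT = f[i];
        generateRow();                         // emit one target row
    }
}

The quote stripping assumes no value contains an embedded comma, which holds for the sample shown; a real CSV parser would be safer otherwise.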
Upvotes: 0