rikta
rikta

Reputation: 13

Need to add header and trailer record in a csv file - Azure Data factory

I am new to azure data factory and need to implement below logic using Azure Data Factory where we are transferring a csv file from source to destination with some transformation in the file.

Input file contains below data :

111|101|2019-02-04 21:04:57

222|202|2019-02-04 21:33:54

333|202|2019-02-04 20:23:55

Expected Output :

H|TestFile|currentDateTime------------ Need to add this header record. H and TestFile would be static

111|101|2019-02-04 21:04:57

222|202|2019-02-04 21:33:54

333|202|2019-02-04 20:23:55

T|03-------------------------------------- T is static value. Need to add total number of records here.

Can someone please help with this

Upvotes: 1

Views: 2269

Answers (1)

Joseph  Xu
Joseph Xu

Reputation: 6043

Update:
After my series of tests, the final result I can get is as follows: enter image description here
The structure overview is as follows: enter image description here

  1. I saved the header into a txt file.
    enter image description here

  2. source1 stores the source csv file, I set column name as Column_1 at Projection tab. enter image description here The source1 data preview is as follows: enter image description here

  3. At SurrogateKey1 activity, I key in Row_No as Key column and 1 as Start value. enter image description here

  4. At Window1 activity, select Row_No as Window column, then enter expression max(Row_No). enter image description here Window1 data preview is as follows, I can get the max value of the Row_No. enter image description here

  5. Use Pivot1 activity to switch from columns to rows, enter expression concat('T|',toString(max(Row_No),'00')) to get T|03. enter image description here Pivot1 activity data preview is as follows: enter image description here

  6. The settings of source2 is the same as source1. enter image description here

  7. At DerivedColumn1,
    set column name: Column1 ,
    set expression: concat(Column_1,'|',toString(currentTimestamp())). enter image description here

  8. At SurrogateKey2 activity, I key in Row_No as Key column and 2 as Start value. enter image description here SurrogateKey2 activity data preview is as follows: enter image description here

  9. At Select2 activity, filter the column which we want and give this column an alias. enter image description here
    Data preview is as follows: enter image description here

  10. headers stores the header info in a csv file. Set Column_1 as column name. enter image description here

  11. At SurrogateKey3 activity, I key in Row_No as Key column and 1 as Start value. enter image description here

  12. Union SurrogateKey3 activity with Select2 activity. enter image description here It will sort by Row_No column, so the title will be on the first line. enter image description here

  13. Then we can only select what we need via Select1 activity. enter image description here Select1 activity data preview is as follows: enter image description here

  14. Union Pivot1 activity and Select1 activity via Union2 activity. enter image description here The Union2 activity data preview is as follows: enter image description here

  15. After run debug, final csv file is as follows:
    enter image description here

Upvotes: 2

Related Questions