Amit

Reputation: 1

ADF: load data from CSV in the sequence present in the file

My CSV file's data looks like this:

100,data1,data2,data3

200,data1,data2,data3

300,data1,data2,data3

300,data1,data2,data5

200,data1,data2,data3

300,data1,data2,data3

When I load the data into my table using an ADF data flow, it changes the sequence of the rows. I want the data loaded into the table in the same row sequence as in the file: always the 100 row first, then the 200 row and the 300 rows that belong to it, then the next 200 row and the 300 rows that belong to it. It works for small data files, but for large data files the row sequence is not maintained. It looks like some kind of data buffering and bulk load causes this. Is there a way I can force the rows to load one at a time, in the order they appear in the file?

Upvotes: 0

Views: 492

Answers (1)

Mark Kromer MSFT

Reputation: 3838

You should use a Sort transformation, with the first column as the sort column. Also, under the Optimize tab of the Sort transformation, set the partitioning to "single partition". You need to do this because Mapping Data Flows is a scale-out feature built on Spark: it distributes data across nodes and partitions, so the overall (macro) order of your rows is lost unless you force the data into a single partition.
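To make the effect concrete, here is a minimal plain-Python sketch (not ADF or Spark code) of why partitioned writes can scramble row order and how a single-partition sort restores it. The helper functions and the row-number column are hypothetical and only for illustration. The sketch assumes each row carries its position in the file, because on the sample data from the question, sorting by the first column's value alone would group the two 200 rows together instead of reproducing the file order.

```python
# Sample rows from the question, in file order.
rows = [
    (100, "data1", "data2", "data3"),
    (200, "data1", "data2", "data3"),
    (300, "data1", "data2", "data3"),
    (300, "data1", "data2", "data5"),
    (200, "data1", "data2", "data3"),
    (300, "data1", "data2", "data3"),
]


def split_round_robin(items, n):
    """Spread items over n partitions (hypothetical stand-in for scale-out)."""
    return [items[i::n] for i in range(n)]


def write_partitions(partitions):
    """Simulate a sink that appends each partition's rows as one batch."""
    return [row for part in partitions for row in part]


# Assumption: a row-number column recording each row's position in the file.
indexed = [(i, *row) for i, row in enumerate(rows)]

# Three partitions written one after another: the file order is lost.
scattered = write_partitions(split_round_robin(indexed, 3))

# Sorting on the row number inside a single partition restores the order.
ordered = write_partitions([sorted(scattered, key=lambda r: r[0])])
restored = [r[1:] for r in ordered]

# Sorting on the first data column alone groups equal codes together.
by_value = sorted(rows, key=lambda r: r[0])

print(restored == rows)   # single-partition sort on row number
print(by_value == rows)   # sort on the code value only
```

If your file has no natural ordering key, you would need some column of this kind to sort on; how you generate it in your data flow depends on your source and is left open here.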

Upvotes: 1
