Reputation: 45
I'm facing a pretty interesting task: converting an arbitrary CSV file to a JSON structure following this schema:
{
"Data": [
["value_1", "value_2"],
["value_3", "value_4"]
]
}
In this case, the input file will look like this:
value_1,value_2
value_3,value_4
The requirement is to use Azure Data Factory and I won't be able to delegate this task to Azure Functions or other services.
I'm thinking about using the 'Copy data' activity but can't get my mind around the configuration. TabularTranslator
seems to work only with a fixed number of columns, but the CSV I receive can contain any number of columns.
Maybe Data Flows can help me, but their setup doesn't look easy either. Plus, if I understand it correctly, Data Flows take more time to start up.
So, basically, I just need to take the CSV content and put it into the "Data" 2D array.
Any ideas on how to accomplish this?
Upvotes: 0
Views: 700
Reputation: 6104
To achieve this requirement, using Copy data or TabularTranslator is complicated. This can be achieved using Data Flows in the following way.
First, read the source file with no column delimiter, so that each entire row lands in a single string column named data. In the data preview, each row of the CSV then appears as one value of this data column.
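As a rough sketch in data flow script, the source step might look like this; the stream name CsvSource is my own placeholder, and the exact options the UI generates for your dataset may differ:

source(output(
        data as string
    ),
    allowSchemaDrift: true,
    validateSchema: false) ~> CsvSource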
Next, use the split function in a derived column transformation. I am replacing the same column using split(data, ','), which turns each row into an array of its comma-separated values. In the same step, add a key column with a constant value 'x' so that I can group all rows and convert the grouped data into an array of arrays.
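In data flow script, those two derived columns could be sketched as follows, again with assumed stream names:

CsvSource derive(data = split(data, ','),
    key = 'x') ~> SplitRows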
Then use an aggregate transformation to group by the above created key column, and use the collect aggregate function to create the array of arrays (Data = collect(data)).
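A sketch of that aggregate step, assuming the stream names used above:

SplitRows aggregate(groupBy(key),
    Data = collect(data)) ~> GroupAll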
Finally, use a select transformation to keep only the above created column Data, and in the sink settings choose Output to single file and give a file name.
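The closing select and sink might be sketched like this; partitionFileNames with partitionBy('hash', 1) is how the Output to single file option typically shows up in generated script, but verify against what your own sink produces, and output.json is just a placeholder name:

GroupAll select(mapColumn(
        Data
    )) ~> KeepData
KeepData sink(allowSchemaDrift: true,
    validateSchema: false,
    partitionFileNames:['output.json'],
    partitionBy('hash', 1)) ~> JsonSink

With a JSON sink dataset, the single output file then contains the desired shape, e.g. {"Data": [["value_1", "value_2"], ["value_3", "value_4"]]}.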
Upvotes: 3