Reputation: 2591
A conventional CSV file will contain rows that all have the same columns, and an optional a header row, however in this situation I need to process a CSV that seems non-standard in that it has a "header" row that contains 5 columns, followed by an unspecified number of "body" rows that contain about 15 columns each (unrelated to the header row), and ends with a "footer" row that contains 4 columns.
It looks like the entire file is representing an object, something like this:
headerValue1,headerValue2,headerValue3,headerValue4,headerValue5
bodyvalue1,bodyvalue2,bodyvalue3,bodyvalue4,bodyvalue5,bodyvalue6,bodyvalue7,bodyvalue8,bodyvalue9
bodyvalue1,bodyvalue2,bodyvalue3,bodyvalue4,bodyvalue5,bodyvalue6,bodyvalue7,bodyvalue8,bodyvalue9
footervalue1,footervalue2,footervalue3,footervalue4
I need to convert it to JSON format and so have been trying to set the CSV values to an Array of objects using a for-loop but have had no luck (no useful code to post). Both the CSV formats with header and without in mule don't seem to work as it seems to expect that each column will contain the same field type, which they don't.
Also tried to define a flat-file schema for it but had no luck either (could be due to my limited knowledge in the area).
So my question is how to correctly, or efficiently, pass this CSV data into a usable array or object, or perhaps even straight into JSON format?
Upvotes: 0
Views: 2463
Reputation: 1151
If I understand correctly, your file is a flat file structure consist of header, data/body, footer, where each record / segment is a delimited record. Mule's flat file reader will not work as it can only support fixed-width records. What you can do is to separate each group of records into string and then read each record as csv. I can do that in two transform step, Transform 1 - to read the file and parse the payload as array of strings, Transform 2 - to transform the array of strings to proper json format. Note here that there might performance issue (if your flat file is too big) as it will hold the payload as string in memory.
Flow:
Transform 1:
%dw 2.0
output application/java
---
payload splitBy "\r\n"
Transform 2:
%dw 2.0
output application/json
var sizePayload = sizeOf(payload)
---
{
header: read(payload[0], "application/csv", {"header" : false})[0],
body: read(payload[1 to sizePayload-2] joinBy "\n", "application/csv", {"header" : false}),
footer: read(payload[sizePayload-1], "application/csv", {"header" : false})[0],
}
Given a sample file below:
headerValue1,headerValue2,headerValue3,headerValue4,headerValue5
bodyvalue1,bodyvalue2,bodyvalue3,bodyvalue4,bodyvalue5,bodyvalue6,bodyvalue7,bodyvalue8,bodyvalue9
bodyvalue12ndRow,bodyvalue2,bodyvalue3,bodyvalue4,bodyvalue5,bodyvalue6,bodyvalue7,bodyvalue8,bodyvalue92ndRow
footervalue1,footervalue2,footervalue3,footervalue4
This will result to below json:
{
"header": {
"column_0": "headerValue1",
"column_1": "headerValue2",
"column_2": "headerValue3",
"column_3": "headerValue4",
"column_4": "headerValue5"
},
"body": [
{
"column_0": "bodyvalue1",
"column_1": "bodyvalue2",
"column_2": "bodyvalue3",
"column_3": "bodyvalue4",
"column_4": "bodyvalue5",
"column_5": "bodyvalue6",
"column_6": "bodyvalue7",
"column_7": "bodyvalue8",
"column_8": "bodyvalue9"
},
{
"column_0": "bodyvalue12ndRow",
"column_1": "bodyvalue2",
"column_2": "bodyvalue3",
"column_3": "bodyvalue4",
"column_4": "bodyvalue5",
"column_5": "bodyvalue6",
"column_6": "bodyvalue7",
"column_7": "bodyvalue8",
"column_8": "bodyvalue92ndRow"
}
],
"footer": {
"column_0": "footervalue1",
"column_1": "footervalue2",
"column_2": "footervalue3",
"column_3": "footervalue4"
}
}
Upvotes: 0
Reputation: 25699
DataWeave handles this case. Unnamed columns get a generic column name ('columns_N'). For an example see the documentation. A direct payload convertion (ie 'payload') works.
DataWeave example:
... set the csv into the payload and be sure its MIME type is to application/csv...
<ee:transform doc:name="Transform Message">
<ee:message >
<ee:set-payload ><![CDATA[%dw 2.0
output application/json
---
payload]]>
</ee:set-payload>
</ee:message>
</ee:transform>
Output:
[
{
"headerValue1": "bodyvalue1",
"headerValue2": "bodyvalue2",
"headerValue3": "bodyvalue3",
"headerValue4": "bodyvalue4",
"headerValue5": "bodyvalue5",
"column_5": "bodyvalue6",
"column_6": "bodyvalue7",
"column_7": "bodyvalue8",
"column_8": "bodyvalue9"
},
{
"headerValue1": "bodyvalue1",
"headerValue2": "bodyvalue2",
"headerValue3": "bodyvalue3",
"headerValue4": "bodyvalue4",
"headerValue5": "bodyvalue5",
"column_5": "bodyvalue6",
"column_6": "bodyvalue7",
"column_7": "bodyvalue8",
"column_8": "bodyvalue9"
},
{
"headerValue1": "footervalue1",
"headerValue2": "footervalue2",
"headerValue3": "footervalue3",
"headerValue4": "footervalue4"
}
]
Upvotes: 1