Reputation: 45
I have a problem with Talend: I have to extract a very strange JSON format, it looks like:
{"results":[{"id":0,"series":[{"name":"table1","columns":["column1","column2","column3","column4"],"values":[["Value1","Value2","Value3","Value4"],["Value1","Value2","Value3","Value4"],["Value1","Value2","Value3","Value4"],["Value1","Value2","Value3","Value4"],["Value1","Value2","Value3","Value4"]]}]}]}
in practice we have that inside the "series" object we have the "columns" object with the name of the various columns and the "values" object with the values of the various rows. The desired output would be a table/csv/json that has a more normal format, so field and value. Does anyone know how I could do this? So far I have tried extracting the various JSON fields but the output is as follows:
Columns
Column1
Column2
Column3
Column4
values
["Value1","Value2","Value3","Value4"]
["Value1","Value2","Value3","Value4"]
["Value1","Value2","Value3","Value4"]
["Value1","Value2","Value3","Value4"]
(For this one probably I have to extract another JSON field, I thought).
Thanks to all
PS. I added Talend in the post
Upvotes: 0
Views: 514
Reputation: 4051
Here's a solution to get the results as a csv file.
I used tFixedFlowInput_1 and tFixedFlowInput_3 as an input with the json from your example.
tExtractJSONFields_1 extracts the individual columns from the columns array, then it's denormalized into a file.
tExtractJSONFields_2 extracts the values as arrays, then for each one, we extract the individual values using tExtractJSONFields_3, and we denormalize each set of values to get a csv row in tFileOutputDelimited_3 (writing to the previous file in append mode).
The final result looks like this:
column1,column2,column3,column4
Value1,Value2,Value3,Value4
Value1,Value2,Value3,Value4
Value1,Value2,Value3,Value4
Value1,Value2,Value3,Value4
Value1,Value2,Value3,Value4
I used the comma as a separator, can be changed in tDenormalize_1 and tDenormalize_2
Upvotes: 2
Reputation: 21463
You didn't specify any language, so I guess any language is fair game? This PHP script
<?php
$js=<<<'JS'
{
"results": [{
"id": 0,
"series": [{
"name": "table1",
"columns": ["column1", "column2", "column3", "column4"],
"values": [
["Value1", "Value2", "Value3", "Value4"],
["Value1", "Value2", "Value3", "Value4"],
["Value1", "Value2", "Value3", "Value4"],
["Value1", "Value2", "Value3", "Value4"],
["Value1", "Value2", "Value3", "Value4"]
]
}]
}]
}
JS;
$data=json_decode($js,true);
$extracted=array();
foreach($data['results'] as $result){
foreach($result['series'] as $serie){
foreach($serie['values'] as $values){
$extract=[];
foreach($values as $valueKey=>$value){
$extract[$serie["columns"][$valueKey]]=$value;
}
$extracted[]=$extract;
}
}
}
echo json_encode($extracted,JSON_PRETTY_PRINT);
outputs
[
{
"column1": "Value1",
"column2": "Value2",
"column3": "Value3",
"column4": "Value4"
},
{
"column1": "Value1",
"column2": "Value2",
"column3": "Value3",
"column4": "Value4"
},
{
"column1": "Value1",
"column2": "Value2",
"column3": "Value3",
"column4": "Value4"
},
{
"column1": "Value1",
"column2": "Value2",
"column3": "Value3",
"column4": "Value4"
},
{
"column1": "Value1",
"column2": "Value2",
"column3": "Value3",
"column4": "Value4"
}
]
Upvotes: 0