Reputation: 164
Relatively new to Python here, coming from a node.js background, having quite a few issues parsing the output I get from get_query_results()
I have been at this for some hours, i have tried iterating through the ['ResultSetMetadata']['ColumnInfo']
to grab the column names, but i don't know how to tie the ['ResultSet']['Data']
to these items so the code knows which name to apply to each dataValue
.
I know i need to select the row headers then add the associated objects to those rows, but the logic on how to do such a thing in python escapes me.
I can see that the first column name always lines up with the first ['Data']['VarCharValue']
so I can get all the values in order, but if I loop through ['ResultSet']['Rows']
how do I isolate the first iteration as the column names to then populate with each other row?
Or is there a better way to do this?
Here is my json.dumps(ATHENAoutput)
{
"ResultSet": {
"Rows": [{
"Data": [{
"VarCharValue": "postcode"
}, {
"VarCharValue": "CountOf"
}]
}, {
"Data": [{
"VarCharValue": "1231"
}, {
"VarCharValue": "2"
}]
}, {
"Data": [{
"VarCharValue": "1166"
}, {
"VarCharValue": "2"
}]
}, {
"Data": [{
"VarCharValue": "3651"
}, {
"VarCharValue": "3"
}]
}, {
"Data": [{
"VarCharValue": "2171"
}, {
"VarCharValue": "2"
}]
}, {
"Data": [{
"VarCharValue": "4697"
}, {
"VarCharValue": "2"
}]
}, {
"Data": [{
"VarCharValue": "4450"
}, {
"VarCharValue": "2"
}]
}, {
"Data": [{
"VarCharValue": "4469"
}, {
"VarCharValue": "1"
}]
}],
"ResultSetMetadata": {
"ColumnInfo": [{
"Scale": 0,
"Name": "postcode",
"Nullable": "UNKNOWN",
"TableName": "",
"Precision": 2147483647,
"Label": "postcode",
"CaseSensitive": true,
"SchemaName": "",
"Type": "varchar",
"CatalogName": "hive"
}, {
"Scale": 0,
"Name": "CountOf",
"Nullable": "UNKNOWN",
"TableName": "",
"Precision": 19,
"Label": "CountOf",
"CaseSensitive": false,
"SchemaName": "",
"Type": "bigint",
"CatalogName": "hive"
}]
}
},
"ResponseMetadata": {
"RetryAttempts": 0,
"HTTPStatusCode": 200,
"RequestId": "18190e7c-901c-40b4-b6ef-10a5013b1a70",
"HTTPHeaders": {
"date": "Mon, 01 Oct 2018 04:51:14 GMT",
"x-amzn-requestid": "18190e7c-901c-40b4-b6ef-10a5013b1a70",
"content-length": "1464",
"content-type": "application/x-amz-json-1.1",
"connection": "keep-alive"
}
}
}
My desired Result is a JSON Array like the following:
[{
"postcode": "2171",
"CountOf": "2"
}, {
"postcode": "4697",
"CountOf": "2"
}, {
"postcode": "1166",
"CountOf": "2"
},
...
]
Upvotes: 1
Views: 4469
Reputation: 1
A little late, but I ran into the same problem today.
Here was my solution:
headers = [header["VarCharValue"] for header in results["ResultSet"]["Rows"][0]["Data"]]
parsed_results = [
{
headers[i]: results["ResultSet"]["Rows"][j]["Data"][i]["VarCharValue"]
for i in range(0, len(headers))
}
for j in range(1, len(results["ResultSet"]["Rows"]))
]
Upvotes: 0
Reputation: 3382
>>> def get_var_char_values(d):
... return [obj['VarCharValue'] for obj in d['Data']]
...
...
... header, *rows = input_data['ResultSet']['Rows']
... header = get_var_char_values(header)
... result = [dict(zip(header, get_var_char_values(row))) for row in rows]
>>> import json; print(json.dumps(result, indent=2))
[
{
"postcode": "4450",
"CountOf": "2"
},
{
"postcode": "1231",
"CountOf": "2"
},
{
"postcode": "4469",
"CountOf": "1"
},
{
"postcode": "3651",
"CountOf": "3"
},
{
"postcode": "1166",
"CountOf": "2"
},
{
"postcode": "4697",
"CountOf": "2"
},
{
"postcode": "2171",
"CountOf": "2"
}
]
Upvotes: 6