Reputation: 1459
I currently read a JSON data structure from an API. This structure is kind of nested:
my_json={
"data": {
"allSites": {
"activeLicenses": 0,
"totalLicenses": 1100
},
"sites": [
{
"accountId": "12345",
"accountName": "ACME INC",
"activeLicenses": 0,
"createdAt": "2021-01-12T20:04:12.166693Z",
"creator": null,
"creatorId": null,
"expiration": null,
"externalId": null,
"healthStatus": true,
"id": "12345",
"isDefault": true,
"name": "Default site",
"registrationToken": "rznwzrsnbwrn==",
"siteType": "Paid",
"sku": "Core",
"state": "active",
"suite": "Core",
"totalLicenses": 0,
"unlimitedExpiration": true,
"unlimitedLicenses": true,
"updatedAt": "2021-01-12T20:04:12.165504Z"
},
{
"accountId": "67890",
"accountName": "DE | C12345 | ACME Inc",
"activeLicenses": 0,
"createdAt": "2021-01-15T12:53:05.363922Z",
"creator": "John Doe",
"creatorId": "567837837",
"expiration": "2021-01-31T02:00:00Z",
"externalId": "C12345",
"healthStatus": true,
"id": "3268726578",
"isDefault": true,
"name": "Default site",
"registrationToken": "dghmntzme6umeum==",
"siteType": "Paid",
"sku": "Core",
"state": "active",
"suite": "Core",
"totalLicenses": 1000,
"unlimitedExpiration": false,
"unlimitedLicenses": false,
"updatedAt": "2021-01-15T12:53:05.878138Z"
},
{
"accountId": "769i376586256",
"accountName": "ACME Inc 2",
"activeLicenses": 0,
"createdAt": "2021-01-16T10:48:55.629903Z",
"creator": "Jon Doe",
"creatorId": "267267",
"expiration": null,
"externalId": null,
"healthStatus": true,
"id": "467267267",
"isDefault": false,
"name": "IT PoC",
"registrationToken": "sthmetuzmstmwsu==",
"siteType": "Trial",
"sku": "Complete",
"state": "active",
"suite": "Complete",
"totalLicenses": 100,
"unlimitedExpiration": true,
"unlimitedLicenses": false,
"updatedAt": "2021-01-16T10:48:55.940332Z"
}
]
},
"pagination": {
"nextCursor": null,
"totalItems": 3
}
}
I'm only interested in the elements within "sites", everything else can be ignored for the time being. In the corresponding XLS the Column Names should be the item names like accoutID, accountName, etc.
I'm able to get everything exported to CSV or XLSX with Pandas, but how to I limit the export to "sites" only and get the column headers?
thanks
---edit while trying below suggestions, I get keyword errors.
My code looks like:
r = requests.get("https://URL/web/api/v2.1/sites?limit=999&sortBy=name&states=active", headers={"Authorization":token})
data = r.json()
print(data)
sites = data['allSites']['sites']
result = pd.DataFrame(sites)
print(result)
the result of "data" the content mentioned above.
when I now do
sites = data['allSites']['sites']
result = pd.DataFrame(sites)
print(result)
I get back an error:
Traceback (most recent call last): File "", line 1, in File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile pydev_imports.execfile(filename, global_vars, local_vars) # execute the script File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile exec(compile(contents+"\n", file, 'exec'), glob, loc) File "/Users/adieball/Dropbox/Multiverse/Programming/S1MSSPProvisioning/getStats.py", line 45, in main() File "/Users/adieball/Dropbox/Multiverse/Programming/S1MSSPProvisioning/getStats.py", line 29, in main sites = data['allSites']['sites'] KeyError: 'allSites'
Upvotes: 0
Views: 278
Reputation: 10624
You can get the desired data by using the following:
sites= my_json['data']['sites']
and create a dataframe with:
res=pd.DataFrame(sites)
Output will be:
>>>print(res)
accountId accountName ... unlimitedLicenses updatedAt
0 12345 ACME INC ... True 2021-01-12T20:04:12.165504Z
1 67890 DE | C12345 | ACME Inc ... False 2021-01-15T12:53:05.878138Z
2 769i376586256 ACME Inc 2 ... False 2021-01-16T10:48:55.940332Z
[3 rows x 21 columns]
Upvotes: 1