Reputation: 45
I am attempting to parse this JSON and convert it to Excel and only print out certain columns using Python but I keep getting an TypeError: string indices must be integers
. The JSON file continues to repeat what is inside "d"
over and over again.
JSON:
{
"d": [
{
"__type": "New Cust",
"Description": "TV Purchase",
"End": "/Date(1624962600000)/",
"ID": 1,
"Resources": [
{
"Available": true,
"Key": "12345",
"Text": "John Doe",
"Type": "SalesProvider",
"CalendarID": "1234"
}
],
"Subject": "Meeting with Sam",
"Visible": true,
"AppStatus": "5",
"StartTime": "06/01/2021 10:30:00",
"AppID": "1",
"ServiceID": "7",
"NotesCount": "0",
"CustomerID": "1",
"AppType": "NNR",
"IsEditedThis": "False",
"BusinessPackageID": "0",
"PopupNotesCount": "0",
"EventType": "1",
"SPName": "John Doe",
"SPID": "12345",
"EventCapacity": "0",
"IsPaid": "False",
"IsWorkShop": "False",
"FormStatus": "0",
"PaidType": "1",
"AppComment": "",
"CustName": "Sam Will",
"ResourceID": "",
"CssClass": "rsCategoryBlue",
"ServiceName": "TV Sale",
"NoOfAttendees": null,
"StreamingNoOfAttendees": null,
"StreamingStatus": "0",
"StreamingEventCapacity": "",
"Photo": "",
"PersonalOffType": null,
"ResourceName": null,
"IsShowCheckIn": false,
"PaymentStatus": 0
},
{
"__type": "New Cust",
"Description": "Receiver Purchase",
"End": "/Date(1624962600000)/",
"ID": 1,
"Resources": [
{
"Available": true,
"Key": "12345",
"Text": "John Doe",
"Type": "SalesProvider",
"CalendarID": "1234"
}
],
"Subject": "Meeting with Bill",
"Visible": true,
"AppStatus": "5",
"StartTime": "07/02/2021 9:30:00",
"AppID": "1",
"ServiceID": "7",
"NotesCount": "0",
"CustomerID": "1",
"AppType": "NNR",
"IsEditedThis": "False",
"BusinessPackageID": "0",
"PopupNotesCount": "0",
"EventType": "1",
"SPName": "John Doe",
"SPID": "12345",
"EventCapacity": "0",
"IsPaid": "False",
"IsWorkShop": "False",
"FormStatus": "0",
"PaidType": "1",
"AppComment": "",
"CustName": "Bill Tom",
"ResourceID": "",
"CssClass": "rsCategoryBlue",
"ServiceName": "Audio Sale",
"NoOfAttendees": null,
"StreamingNoOfAttendees": null,
"StreamingStatus": "0",
"StreamingEventCapacity": "",
"Photo": "",
"PersonalOffType": null,
"ResourceName": null,
"IsShowCheckIn": false,
"PaymentStatus": 0
}
]
}
Python Code:
import json
import pandas as pd
f = open('JSON.txt', 'r')
data = json.loads(f.read())
l = []
for profile in data['d']:
l.append(profile["Subject"]["StartTime"]["IsPaid"]["CustName"]["ServiceName"])
df1 = pd.DataFrame(l)
print(df1)
df1.to_excel('df1.xlsx')
I do not need the "Resources": []
info I just need certain parameters outside it in the JSON object. I am having difficulty parsing the JSON any help would be appreciated.
Upvotes: 0
Views: 614
Reputation: 23099
you can use a combination of the json
standard library and pd.json_normalize
import json
import pandas as pd
parsed_json = json.loads('JSON.txt')
df = pd.json_normalize(parsed_json['d'],record_path='Resources')
print(df)
Available Key Text Type CalendarID
0 True 12345 John Doe SalesProvider 1234
1 True 12345 John Doe SalesProvider 1234
then pass it to excel
df.to_excel(...,index=False)
Going back to your issues, it seems like your trying to grab a bunch of fields like a list, when in reality you're attempting to find an attribute from a single data type object.
print(data['d'][0]['Subject'])
'Meeting with Sam'
This has no other nested items so you'll naturally get an error.
the error TypeError: string indices must be integers
is telling you you can only slice this object with integers i.e
#reversing the string.
print(data['d'][0]['Subject'][::-1])
maS htiw gniteeM
or
#first two characters of the string.
print(data['d'][0]['Subject'][:2])
Me
if you want to grab only a subset of columns from the top level you could do :
cols = ['Subject', 'StartTime', 'IsPaid', 'CustName', 'ServiceName']
df = pd.json_normalize(parsed_json['d'],)[cols]
print(df)
Subject StartTime IsPaid CustName ServiceName
0 Meeting with Sam 06/01/2021 10:30:00 False Sam Will TV Sale
1 Meeting with Bill 07/02/2021 9:30:00 False Bill Tom Audio Sale
Upvotes: 1