Reputation: 32304
This statement reads the JSON file, but it does not split the columns correctly:
df = pd.read_json('https://s3.amazonaws.com/todel162/config1.json', orient='index')
Is there a way to read this JSON into a pandas DataFrame so that each field gets its own column?
Upvotes: 1
Views: 63
Reputation: 431
You can load the JSON yourself and build the DataFrame from the "configurationItems" list:
import json
import urllib.request as req
import pandas as pd
with req.urlopen("https://s3.amazonaws.com/todel162/config1.json") as j:
    raw = json.loads(j.read().decode())
df = pd.DataFrame(raw["configurationItems"])
df["fileVersion"] = raw["fileVersion"]
print(df)
Upvotes: 1
Reputation: 863431
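You can use json_normalize (in pandas 1.0 and later it is exposed as pd.json_normalize, and the pandas.io.json import path is deprecated):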
import json
from pandas.io.json import json_normalize
with open('config1.json') as f:
    data = json.load(f)
# Expand the 'configurationItems' list into rows; copy 'fileVersion' onto each row
df = json_normalize(data, 'configurationItems', ['fileVersion'])
print(df)
                                                 ARN  awsAccountId  awsRegion  \
0  arn:aws:cloudtrail:us-east-1:513469704633:trai...  513469704633  us-east-1
1  arn:aws:cloudtrail:us-east-1:513469704633:trai...  513469704633  us-east-1

   configurationItemCaptureTime  configurationItemStatus  \
0      2018-07-27T11:52:53.795Z          ResourceDeleted
1      2018-07-27T11:52:53.791Z          ResourceDeleted

   configurationItemVersion  configurationStateId  configurationStateMd5Hash  \
0                       1.3         1532692373795
1                       1.3         1532692373791

   relatedEvents  relationships                 resourceId  \
0             []             []  AWSMacieTrail-DO-NOT-EDIT
1             []             []                     test01

             resourceType  supplementaryConfiguration  tags  fileVersion
0  AWS::CloudTrail::Trail                          {}    {}          1.0
1  AWS::CloudTrail::Trail                          {}    {}          1.0
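For anyone on a newer pandas, here is a minimal sketch of the same idea using pd.json_normalize (assuming pandas 1.0 or later). The dictionary below is a made-up stand-in that only mirrors the shape of the file (a top-level "fileVersion" plus a "configurationItems" list); the resource IDs in it are hypothetical, not taken from config1.json.

```python
import pandas as pd

# Hypothetical sample with the same shape as the file: one top-level scalar
# and a list of records. The values are invented for illustration only.
data = {
    "fileVersion": "1.0",
    "configurationItems": [
        {"resourceId": "example-trail-a", "awsRegion": "us-east-1"},
        {"resourceId": "example-trail-b", "awsRegion": "us-east-1"},
    ],
}

# record_path selects the list that becomes the rows;
# meta lists top-level keys to repeat on every row.
df = pd.json_normalize(data, record_path="configurationItems", meta=["fileVersion"])
print(df)
```

With the real file, you would pass the dictionary returned by json.load in place of the sample data.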
Upvotes: 1