Reputation: 11881
I have json that looks like this:
{
"formatVersion" : "v1.0",
"disclaimer" : "This pricing list is for informational purposes only ..."
"offerCode" : "AmazonEC2",
"version" : "20181122020351",
"publicationDate" : "2018-11-22T02:03:51Z",
"products" : {
"G5FFNNK98ETA2UBE" : {
"sku" : "G5FFNNK98ETA2UBE",
"productFamily" : "Compute Instance",
"attributes" : {
"servicecode" : "AmazonEC2",
"location" : "Asia Pacific (Tokyo)",
"locationType" : "AWS Region",
"instanceType" : "c4.4xlarge",
"currentGeneration" : "Yes",
"instanceFamily" : "Compute optimized",
"vcpu" : "16",
"physicalProcessor" : "Intel Xeon E5-2666 v3 (Haswell)",
"clockSpeed" : "2.9 GHz",
"memory" : "30 GiB",
"storage" : "EBS only",
and I'm trying to convert it to a Pandas DataFrame using this code:
df = pd.DataFrame()
for sku, data in json.loads(ec2offer)['products'].items():
if data['productFamily'] == 'Compute Instance':
new_df = pd.DataFrame.from_dict(data['attributes'], index=[0])
df.append(new_df, ignore_index=True)
print(df)
Before adding index=[0]
, I was getting the error “ValueError: If using all scalar values, you must pass an index”
So I added that based on the answer to Constructing pandas DataFrame from values in variables gives "ValueError: If using all scalar values, you must pass an index"
Now I'm getting this error instead:
TypeError: from_dict() got an unexpected keyword argument 'index'
TL;DR
Forget about the above code. What's the simplest way to add each 'attributes' structure from the above json into its own row in a Pandas dataframe?
EXPECTED OUTPUT
instanceType memory ...
c4.4xlarge 30 Gib ...
... ... ...
Upvotes: 0
Views: 4961
Reputation: 4638
jsonstr={
"formatVersion": "v1.0",
"disclaimer": "This pricing list is for informational purposes only ...",
"offerCode": "AmazonEC2",
"version": "20181122020351",
"publicationDate": "2018-11-22T02:03:51Z",
"products": {
"G5FFNNK98ETA2UBE": {
"sku": "G5FFNNK98ETA2UBE",
"productFamily": "Compute Instance",
"attributes": {
"servicecode": "AmazonEC2",
"location": "Asia Pacific (Tokyo)",
"locationType": "AWS Region",
"instanceType": "c4.4xlarge",
"currentGeneration": "Yes",
"instanceFamily": "Compute optimized",
"vcpu": "16",
"physicalProcessor": "Intel Xeon E5-2666 v3 (Haswell)",
"clockSpeed": "2.9 GHz",
"memory": "30 GiB",
"storage": "EBS only"
}
},
"G5FFNNK98ETA2VIB": {
"sku": "G5FFNNK98ETA2UBE",
"productFamily": "Compute Instance",
"attributes": {
"servicecode": "AmazonEC22",
"location": "Asia Pacific (Tokyo)",
"locationType": "AWS Region",
"instanceType": "c4.4xlarge",
"currentGeneration": "Yes",
"instanceFamily": "Compute optimized",
"vcpu": "16",
"physicalProcessor": "Intel Xeon E5-2666 v3 (Haswell)",
"clockSpeed": "2.9 GHz",
"memory": "30 GiB",
"storage": "EBS only"
}
}
}
}
import pandas as pd
d={}
for product in jsonstr['products'].keys():
d[product]={}
d[product]=jsonstr['products'][product]['attributes']
df=pd.DataFrame(d).T.reset_index().drop('index',1)
Output:
df
Upvotes: 1