Reputation: 35
I am new to Python and I am trying to convert the following JSON into a panda frame.
The format of json is as follows. I have reduced the columns and rows. There are around 8 columns and each json has around 20000 rows
{
"DataFeed":[
{
"Columns":[
{
"Name":"customerID",
"Category":"Dimension",
"Type":"String"
},
{
"Name":"InvoiceID",
"Category":"Dimension",
"Type":"String"
},
{
"Name":"storeloc",
"Category":"Dimension",
"Type":"String"
}
],
"Rows":[
{
"customerID":"id128404805",
"InvoiceID":"IN3956",
"storeloc":"TX359"
},
{
"customerID":"id128404806",
"InvoiceID":"IN0054",
"storeloc":"CA235"
},
{
"customerID":"id128404807",
"InvoiceID":"IN7439",
"storeloc":"AZ2309"
}
]
}
]
}
i am trying to load it into a pandas dataframe. The number of columns are the same in json file. The number of rows are around 10000.
I am trying to get into the rows and insert into a table after certain calculations.
I am trying to use json_normalize but I am struggling with navigating to the Rows level and normalizing after that. I know it must be an issue solution but I am new to working with Json. Thanks
Upvotes: 0
Views: 238
Reputation: 23099
try pd.json_normalize()
with the record_path
argument.
Note, you'll need pandas 0.25 or higher.
assuming your json object is j
df = pd.json_normalize(j,record_path=['DataFeed','Rows'])
print(df)
customerID InvoiceID storeloc
0 id128404805 IN3956 TX359
1 id128404806 IN0054 CA235
2 id128404807 IN7439 AZ2309
Upvotes: 3