Reputation: 3971
I am trying to read in the JSON structure below into pandas dataframe, but it throws out the error message:
ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.
Json data:
{
"status": {
"statuscode": 200,
"statusmessage": "Everything OK"
},
"result": [{
"id": 22,
"club_id": 16182
}, {
"id": 23,
"club_id": 16182
}, {
"id": 24,
"club_id": 16182
}, {
"id": 25,
"club_id": 16182
}, {
"id": 26,
"club_id": 16182
}, {
"id": 27,
"club_id": 16182
}]
}
How do I get this right? I have tried the script below...
j_df = pd.read_json('json_file.json')
j_df
with open(j_file) as jsonfile:
data = json.load(jsonfile)
Upvotes: 51
Views: 133920
Reputation: 707
I guess pandas dataframe wants it to be a list at the root level. So just put the data into a list:
with open(j_file) as jsonfile:
data = json.load(jsonfile)
df = pd.DataFrame([data])
Upvotes: 5
Reputation: 5084
The json_normalize
is a valid approach - but in my usecase I had to keep both: original dicts and arrays in the dataframe and I used that approach:
input_df = pd.read_json(input_path, lines=True, orient="columns")
The catch is that the file should be in the new line delimited json format (one json document per line in file), so it should look like this (I used jq -c
to convert it to one line):
{"status":{"statuscode":200,"statusmessage":"Everything OK"},"result":[{"id":22,"club_id":16182},{"id":23,"club_id":16182},{"id":24,"club_id":16182},{"id":25,"club_id":16182},{"id":26,"club_id":16182},{"id":27,"club_id":16182}]}
Upvotes: 1
Reputation: 97
If you need only the result part in the data frame, here is the code to help you:
import json
import pandas as pd
data = json.load(open('json_file.json'))
df = pd.DataFrame(data["result"])
To the best of my knowledge, the ValueError occurs because the data types are all over the place, some strings, some lists, multiple {} etc. This error may be solved by normalizing the data. To do that, here is the code below:
import json
with open('json_file.json') as project_file:
data = json.load(project_file)
df = pd.json_normalize(data)
Upvotes: 8
Reputation: 862731
You can use json_normalize
with assign
:
from pandas.io.json import json_normalize
import json
with open('json_file.json') as data_file:
d= json.load(data_file)
df = json_normalize(d, 'result').assign(**d['status'])
print (df)
club_id id statuscode statusmessage
0 16182 22 200 Everything OK
1 16182 23 200 Everything OK
2 16182 24 200 Everything OK
3 16182 25 200 Everything OK
4 16182 26 200 Everything OK
5 16182 27 200 Everything OK
Upvotes: 17
Reputation: 1281
If you just need the result part in a dataframe, then here is the code to help you.
import json
import pandas as pd
data = json.load(open('json_file.json'))
df = pd.DataFrame(data["result"])
Upvotes: 65