userPyGeo
userPyGeo

Reputation: 3971

Read JSON to pandas dataframe - ValueError: Mixing dicts with non-Series may lead to ambiguous ordering

I am trying to read in the JSON structure below into pandas dataframe, but it throws out the error message:

ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.

Json data:

{
    "status": {
        "statuscode": 200,
        "statusmessage": "Everything OK"
    },

    "result": [{
        "id": 22,
        "club_id": 16182
    }, {
        "id": 23,
        "club_id": 16182
    }, {
        "id": 24,
        "club_id": 16182
    }, {
        "id": 25,
        "club_id": 16182
    }, {
        "id": 26,
        "club_id": 16182
    }, {
        "id": 27,
        "club_id": 16182
    }]
}

How do I get this right? I have tried the script below...

j_df = pd.read_json('json_file.json')
j_df

with open(j_file) as jsonfile:
    data = json.load(jsonfile)

Upvotes: 51

Views: 133920

Answers (5)

Ekrem Solmaz
Ekrem Solmaz

Reputation: 707

I guess pandas dataframe wants it to be a list at the root level. So just put the data into a list:

with open(j_file) as jsonfile:
    data = json.load(jsonfile)
    df = pd.DataFrame([data])

Upvotes: 5

botchniaque
botchniaque

Reputation: 5084

The json_normalize is a valid approach - but in my usecase I had to keep both: original dicts and arrays in the dataframe and I used that approach:

input_df = pd.read_json(input_path, lines=True, orient="columns")

The catch is that the file should be in the new line delimited json format (one json document per line in file), so it should look like this (I used jq -c to convert it to one line):

{"status":{"statuscode":200,"statusmessage":"Everything OK"},"result":[{"id":22,"club_id":16182},{"id":23,"club_id":16182},{"id":24,"club_id":16182},{"id":25,"club_id":16182},{"id":26,"club_id":16182},{"id":27,"club_id":16182}]}

Upvotes: 1

light1203
light1203

Reputation: 97

If you need only the result part in the data frame, here is the code to help you:

import json
import pandas as pd
data = json.load(open('json_file.json'))

df = pd.DataFrame(data["result"])

To the best of my knowledge, the ValueError occurs because the data types are all over the place, some strings, some lists, multiple {} etc. This error may be solved by normalizing the data. To do that, here is the code below:

import json

with open('json_file.json') as project_file:    
    data = json.load(project_file)  

df = pd.json_normalize(data)

Upvotes: 8

jezrael
jezrael

Reputation: 862731

You can use json_normalize with assign:

from pandas.io.json import json_normalize
import json

with open('json_file.json') as data_file:    
    d= json.load(data_file)  

df = json_normalize(d, 'result').assign(**d['status'])
print (df)
   club_id  id  statuscode  statusmessage
0    16182  22         200  Everything OK
1    16182  23         200  Everything OK
2    16182  24         200  Everything OK
3    16182  25         200  Everything OK
4    16182  26         200  Everything OK
5    16182  27         200  Everything OK

Upvotes: 17

Rao Sahab
Rao Sahab

Reputation: 1281

If you just need the result part in a dataframe, then here is the code to help you.

import json
import pandas as pd
data = json.load(open('json_file.json'))

df = pd.DataFrame(data["result"])

Upvotes: 65

Related Questions