adimohankv

Reputation: 281

ValueError: Expected object or value when reading json as pandas dataframe

Sample data:

{
   "_id": "OzE5vaa3p7",
   "categories": [
      {
         "__type": "Pointer",
         "className": "Category",
         "objectId": "nebCwWd2Fr"
      }
   ],
   "isActive": true,
   "imageUrl": "https://firebasestorage.googleapis.com/v0/b/shopgro-1376.appspot.com/o/Barcode%20Data%20Upload%28II%29%2FAnil_puttu_flour_500g.png?alt=media&token=9cf63197-0925-4360-a31a-4675f4f46ae2",
   "barcode": "8908001921015",
   "isFmcg": true,
   "itemName": "Anil puttu flour 500g",
   "mrp": 58,
   "_created_at": "2016-10-02T13:49:03.281Z",
   "_updated_at": "2017-02-22T08:48:09.548Z"
}

{
   "_id": "ENPCL8ph1p",
   "categories": [
      {
         "__type": "Pointer",
         "className": "Category",
         "objectId": "B4nZeUHmVK"
      }
   ],
   "isActive": true,
   "imageUrl": "https://firebasestorage.googleapis.com/v0/b/kirananearby-9eaa8.appspot.com/o/Barcode%20data%20upload%2FYippee_Magic_Masala_Noodles,_70_g.png?alt=media&token=d9e47bd7-f847-4d6f-9460-4be8dbcaae00",
   "barcode": "8901725181222",
   "isFmcg": true,
   "itemName": "Yippee Magic Masala Noodles, 70 G",
   "mrp": 12,
   "_created_at": "2016-10-02T13:49:03.284Z",
   "_updated_at": "2017-02-22T08:48:09.074Z"
}

I tried:

import pandas as pd
data= pd.read_json('Data.json')

This raises ValueError: Expected object or value

I also tried:

import json
with open('gdb.json') as datafile:
    data = json.load(datafile)
retail = pd.DataFrame(data)

error: json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 509)

with open('gdb.json') as datafile:
    for line in datafile:
        data = json.loads(line)
retail = pd.DataFrame(data)

error: json.decoder.JSONDecodeError: Extra data: line 1 column 577 (char 576)

How can I read this JSON into a pandas DataFrame?

Upvotes: 28

Views: 153201

Answers (18)

Nephi Balinski

Reputation: 13

Everything is fine except for one thing.

You can't have two separate top-level dictionaries in one JSON file. You need a containing dictionary or list.

Put the following in the .json file:

[
    {
        "_id": "OzE5vaa3p7",
        "categories": [
            {
                "__type": "Pointer",
                "className": "Category",
                "objectId": "nebCwWd2Fr"
            }
        ],
        "isActive": true,
        "imageUrl": "https://firebasestorage.googleapis.com/v0/b/shopgro-1376.appspot.com/o/Barcode%20Data%20Upload%28II%29%2FAnil_puttu_flour_500g.png?alt=media&token=9cf63197-0925-4360-a31a-4675f4f46ae2",
        "barcode": "8908001921015",
        "isFmcg": true,
        "itemName": "Anil puttu flour 500g",
        "mrp": 58,
        "_created_at": "2016-10-02T13:49:03.281Z",
        "_updated_at": "2017-02-22T08:48:09.548Z"
    },

    {
        "_id": "ENPCL8ph1p",
        "categories": [
            {
                "__type": "Pointer",
                "className": "Category",
                "objectId": "B4nZeUHmVK"
            }
        ],
        "isActive": true,
        "imageUrl": "https://firebasestorage.googleapis.com/v0/b/kirananearby-9eaa8.appspot.com/o/Barcode%20data%20upload%2FYippee_Magic_Masala_Noodles,_70_g.png?alt=media&token=d9e47bd7-f847-4d6f-9460-4be8dbcaae00",
        "barcode": "8901725181222",
        "isFmcg": true,
        "itemName": "Yippee Magic Masala Noodles, 70 G",
        "mrp": 12,
        "_created_at": "2016-10-02T13:49:03.284Z",
        "_updated_at": "2017-02-22T08:48:09.074Z"
    }
]

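Now you can access the data like this:

import pandas as pd
data = pd.read_json('Data.json')

# read_json returns a DataFrame, so select rows by position with .iloc
# To get the first record as a dictionary
dict1 = data.iloc[0].to_dict()

# To get the second record as a dictionary
dict2 = data.iloc[1].to_dict()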
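
If the file is too large to wrap by hand, the same list can be built in code. This is a minimal sketch, assuming the file holds several valid JSON objects written back to back as in the question. The helper name parse_concatenated and the shortened sample string are illustrative, not part of the original data.

```python
import json


def parse_concatenated(text):
    """Parse several JSON values written one after another into a list."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed value and the index where it ended
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)
    return records


# Shortened stand-in for the two records in the question
sample = '{"_id": "OzE5vaa3p7", "mrp": 58}\n\n{"_id": "ENPCL8ph1p", "mrp": 12}\n'
records = parse_concatenated(sample)
print(len(records))
```

The resulting list of dictionaries can then be passed to pd.DataFrame(records).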

Thank you!

Upvotes: 0

rahul

Reputation: 1243

I got the same error. Read the function documentation and experiment with the different parameters.

I solved it by using the call below:

data= pd.read_json('Data.json', lines=True)

You can also try other combinations, such as

data= pd.read_json('Data.json', lines=True, orient='records')

# orient takes a string such as 'split', 'records', 'index', 'columns' or 'values'
data= pd.read_json('Data.json', orient='records')
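
Note that lines=True expects JSON Lines input, that is, one complete record per line. Here is a small sketch of that case, using an in-memory string instead of Data.json. The two shortened records are made up for illustration.

```python
import io

import pandas as pd

# Two records, each on its own line (JSON Lines). Shortened, illustrative data.
text = '{"_id": "OzE5vaa3p7", "mrp": 58}\n{"_id": "ENPCL8ph1p", "mrp": 12}\n'

# lines=True makes pandas parse the input line by line instead of as one document
df = pd.read_json(io.StringIO(text), lines=True)
print(df)
```

The pretty-printed sample in the question spreads each record over many lines, so it would likely need to be flattened to one record per line before lines=True can help.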

Upvotes: 18

Aayush Shah

Reputation: 520

Often the JSON arrives in the following format (for those who are still searching for a solution):

🤔 Problem

{col1:'val1', col2:'val2'}{col1:'val1', col2:'val2'}{col1:'val1', col2:'val2'}

🖖🏻 As you can see, we have three issues here:

  1. Keys are not wrapped in double quotes
  2. Values are quoted, but with single quotes
  3. The records are not separated by a comma and a newline

😅 We will need to fix these three things, plus one preliminary step

0. Add the square brackets if they are not already there: put [ at the beginning of the JSON and ] at the end, which is just a matter of pressing the Home and End keys on your keyboard 😊

1. Replace single quotes with double

import re

# either this (simple): replace every single quote not preceded by a backslash
p = re.compile(r"(?<!\\)'")
data = p.sub('"', data)

# or this - only touches the quotes that wrap a value
p = re.compile(r"(?<=:)\s*'(.*?)'\s*(?=,|\n|})")
data = p.sub(r'"\1"', data)

Assuming the JSON data is in the string format and stored in the data variable.

2. Provide the double quotes to the keys

data = re.sub(r'(\w+)(?=:)', r'"\1"', data)

3. Give the new line for each record

data = re.sub(r'}\s*{', '},\n{', data)

Done! Just save the file 🎉

with open("ABC.json", "w") as file:
    file.write(data)

Load in pandas 🐼

df = pd.read_json(r"./ABC.json")

We are done. We have the clean JSON like this:

[
    {"col1":"val1", "col2":"val2"},
    {"col1":"val1", "col2":"val2"},
    {"col1":"val1", "col2":"val2"}
]
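
The steps above can be combined into one function. This is only a sketch under the assumption that the input looks like the problem sample (flat records, no colons or quotes inside the values). The regular expressions are heuristics: the key pattern, for instance, would also quote the https in a URL value. The names repair and raw are illustrative.

```python
import json
import re

# Illustrative malformed input: bare keys, single-quoted values, no separators
raw = "{col1:'a', col2:'b'}{col1:'c', col2:'d'}"


def repair(text):
    # 1. Turn single-quoted values into double-quoted ones
    text = re.sub(r"(?<=:)\s*'(.*?)'\s*(?=,|\n|})", r'"\1"', text)
    # 2. Wrap bare keys (a word directly followed by a colon) in double quotes
    text = re.sub(r'(\w+)(?=:)', r'"\1"', text)
    # 3. Separate adjacent records with a comma and a newline
    text = re.sub(r'}\s*{', '},\n{', text)
    # 0. Wrap everything in square brackets so the result is one JSON array
    return '[' + text + ']'


fixed = repair(raw)
records = json.loads(fixed)  # raises json.JSONDecodeError if the repair failed
print(fixed)
```

Calling json.loads on the result is a cheap check that the repaired text is valid JSON before it is written to disk and handed to pandas.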

Upvotes: 0

Tiago

Reputation: 1

It seems like there are a million things that can cause this. In my case, my JSON file started with a byte order mark, shown as [BOM] [unix] in vim-airline. I don't know what the byte order mark is for or when it would be needed. To remove it, I ran :set nobomb in vim and then saved the file. After that, pandas could read it and I was good to go.
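
For context, a UTF-8 byte order mark is the three bytes EF BB BF at the very start of a file. The sketch below uses a made-up in-memory payload rather than a real file. It shows how to detect the mark, and that decoding with utf-8-sig drops it so the JSON parser no longer trips over it.

```python
import codecs
import json

# Illustrative file content: a UTF-8 BOM followed by a small JSON array
payload = codecs.BOM_UTF8 + b'[{"mrp": 58}]'

# Detect the BOM by looking at the first bytes
has_bom = payload.startswith(codecs.BOM_UTF8)

# Plain utf-8 keeps the BOM as the character U+FEFF, which the json module rejects
text_plain = payload.decode("utf-8")
try:
    json.loads(text_plain)
    plain_ok = True
except json.JSONDecodeError:
    plain_ok = False

# utf-8-sig strips a leading BOM if one is present
text_sig = payload.decode("utf-8-sig")
records = json.loads(text_sig)
print(has_bom, plain_ok, records)
```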

Upvotes: 0

Miguel Escalante

Reputation: 41

I just solved this problem by adding a "/" at the beginning of the absolute path.

import pandas as pd
pd_from_json = pd.read_json("/home/miguel/folder/information.json")

Upvotes: 0

samad najafi

Reputation: 11

The error ValueError: All arrays must be of the same length, which occurs with

df = pd.read_json(r'./filename.json')

can be solved by adding lines=True to the call:

df = pd.read_json(r'./filename.json', lines=True)

Upvotes: 0

krm73

Reputation: 11

If you type in the absolute path of the file and use \ as the separator, it should work. At least that's how I fixed the issue.

Upvotes: 1

Wais Yousofi

Reputation: 19

This worked for me: pd.read_json('./dataset/healthtemp.json', typ="series")

Upvotes: 0

John Stud

Reputation: 1779

Another variation: the tips from this thread each failed on their own, but combining them worked for me:

pd.read_json('file.json', lines=True, encoding = 'utf-8-sig')

Upvotes: 1

crazysra

Reputation: 151

I faced the same problem. The reason is that the JSON file contains something that doesn't follow the JSON rules. In my case, I had used single quotes instead of double quotes in one of the values.

Upvotes: 4

pieterbons

Reputation: 1724

I encountered this error message today, and in my case the problem was that the encoding of the text file was UTF-8-BOM instead of UTF-8, which is the default for read_json(). This can be solved by specifying the encoding:

data= pd.read_json('Data.json', encoding = 'utf-8-sig')

Upvotes: 10

anilbpoyraz

Reputation: 11

If you try the code below, it will solve the problem:

data_set = pd.read_json(r'json_file_address\file_name.json', lines=True)

Upvotes: 1

Arthur Rees

Reputation: 17

You can try changing the relative path to an absolute path. For your situation, change

import pandas as pd
data= pd.read_json('Data.json')

to

import pandas as pd
data= pd.read_json('C://Data.json')  # the absolute path as shown in the file explorer

I got the same error when I moved the same code from Jupyter Notebook to PyCharm's Jupyter notebook console.

Upvotes: 1

user12240240

Reputation: 11

Keep the path simple; that makes the data easier to read. For example, put the file on your desktop and pass that path when reading the data. It works.

Upvotes: 1

Steven

Reputation: 2133

Your JSON is malformed.

ValueError: Expected object or value can occur if you mistyped the file name. Does Data.json exist? I noticed for your other attempts you used gdb.json.

Once you confirm the file name is correct, you have to fix your JSON. What you have now is two disconnected records separated by a space. Lists in JSON must be valid arrays inside square brackets and separated by a comma: [{record1}, {record2}, ...]

One option for pandas is to put your array under a root element called "data" (a bare top-level array also loads with the default orientation):

{ "data": [ {record1}, {record2}, ... ] }

Your JSON should end up looking like this:

{"data":
    [{
        "_id": "OzE5vaa3p7",
        "categories": [
            {
                "__type": "Pointer",
                "className": "Category",
                "objectId": "nebCwWd2Fr"
            }
        ],
        "isActive": true,
        "imageUrl": "https://firebasestorage.googleapis.com/v0/b/shopgro-1376.appspot.com/o/Barcode%20Data%20Upload%28II%29%2FAnil_puttu_flour_500g.png?alt=media&token=9cf63197-0925-4360-a31a-4675f4f46ae2",
        "barcode": "8908001921015",
        "isFmcg": true,
        "itemName": "Anil puttu flour 500g",
        "mrp": 58,
        "_created_at": "2016-10-02T13:49:03.281Z",
        "_updated_at": "2017-02-22T08:48:09.548Z"
    }
    ,
    {
        "_id": "ENPCL8ph1p",
        "categories": [
            {
                "__type": "Pointer",
                "className": "Category",
                "objectId": "B4nZeUHmVK"
            }
        ],
        "isActive": true,
        "imageUrl": "https://firebasestorage.googleapis.com/v0/b/kirananearby-9eaa8.appspot.com/o/Barcode%20data%20upload%2FYippee_Magic_Masala_Noodles,_70_g.png?alt=media&token=d9e47bd7-f847-4d6f-9460-4be8dbcaae00",
        "barcode": "8901725181222",
        "isFmcg": true,
        "itemName": "Yippee Magic Masala Noodles, 70 G",
        "mrp": 12,
        "_created_at": "2016-10-02T13:49:03.284Z",
        "_updated_at": "2017-02-22T08:48:09.074Z"
    }]}

Finally, pandas can read this layout with the split orientation (which normally also carries "index" and "columns" keys), so you load it as follows:

df = pd.read_json('gdb.json', orient='split')

df now contains the following data frame:

          _id                                                   categories  isActive                                                     imageUrl        barcode  isFmcg                           itemName  mrp                      _created_at                      _updated_at
0  OzE5vaa3p7  [{'__type': 'Pointer', 'className': 'Category', 'objectI...      True  https://firebasestorage.googleapis.com/v0/b/shopgro-1376...  8908001921015    True              Anil puttu flour 500g   58 2016-10-02 13:49:03.281000+00:00 2017-02-22 08:48:09.548000+00:00
1  ENPCL8ph1p  [{'__type': 'Pointer', 'className': 'Category', 'objectI...      True  https://firebasestorage.googleapis.com/v0/b/kirananearby...  8901725181222    True  Yippee Magic Masala Noodles, 70 G   12 2016-10-02 13:49:03.284000+00:00 2017-02-22 08:48:09.074000+00:00
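
For comparison, here is a hedged sketch of the two layouts side by side: the full split layout with separate "columns", "index" and "data" entries, and a bare array of records read with the default settings. The shortened values and the in-memory strings are illustrative stand-ins for gdb.json.

```python
import io

import pandas as pd

# Full "split" layout: column names, index labels and row values kept apart
split_text = (
    '{"columns": ["_id", "mrp"], "index": [0, 1], '
    '"data": [["OzE5vaa3p7", 58], ["ENPCL8ph1p", 12]]}'
)
df_split = pd.read_json(io.StringIO(split_text), orient="split")

# Bare top-level array of records: no orient argument is needed
records_text = '[{"_id": "OzE5vaa3p7", "mrp": 58}, {"_id": "ENPCL8ph1p", "mrp": 12}]'
df_records = pd.read_json(io.StringIO(records_text))

print(df_split)
print(df_records)
```

Both calls should produce the same two-row frame, so the choice mostly depends on which layout the file already has.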

Upvotes: 21

Shakeel

Reputation: 2035

I am not sure I understood your question clearly. Are you just trying to read JSON data?

I collected your sample data into a list, as shown below:

[
  {
   "_id": "OzE5vaa3p7",
   "categories": [
      {
         "__type": "Pointer",
         "className": "Category",
         "objectId": "nebCwWd2Fr"
      }
   ],
   "isActive": true,
   "imageUrl": "https://firebasestorage.googleapis.com/v0/b/shopgro-1376.appspot.com/o/Barcode%20Data%20Upload%28II%29%2FAnil_puttu_flour_500g.png?alt=media&token=9cf63197-0925-4360-a31a-4675f4f46ae2",
   "barcode": "8908001921015",
   "isFmcg": true,
   "itemName": "Anil puttu flour 500g",
   "mrp": 58,
   "_created_at": "2016-10-02T13:49:03.281Z",
   "_updated_at": "2017-02-22T08:48:09.548Z"
},
{
   "_id": "ENPCL8ph1p",
   "categories": [
      {
         "__type": "Pointer",
         "className": "Category",
         "objectId": "B4nZeUHmVK"
      }
   ],
   "isActive": true,
   "imageUrl": "https://firebasestorage.googleapis.com/v0/b/kirananearby-9eaa8.appspot.com/o/Barcode%20data%20upload%2FYippee_Magic_Masala_Noodles,_70_g.png?alt=media&token=d9e47bd7-f847-4d6f-9460-4be8dbcaae00",
   "barcode": "8901725181222",
   "isFmcg": true,
   "itemName": "Yippee Magic Masala Noodles, 70 G",
   "mrp": 12,
   "_created_at": "2016-10-02T13:49:03.284Z",
   "_updated_at": "2017-02-22T08:48:09.074Z"
}
]

and ran this code

import pandas as pd
df = pd.read_json('Data.json')
print(df)

Output:-

              _created_at ... mrp
0 2016-10-02 13:49:03.281 ...  58
1 2016-10-02 13:49:03.284 ...  12

[2 rows x 10 columns]

Upvotes: 1

Mingming

Reputation: 382

Make sure that the terminal's working directory is the same as the file's directory. (When this error occurred for me in VS Code, it meant that the integrated terminal's directory was not the directory of the Python file I wanted to execute.)
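
One way to rule out a working-directory mismatch is to resolve the path explicitly and fail early with a clear message. This is a minimal sketch; the helper name resolve_data_file and its base_dir parameter are hypothetical, not part of pandas.

```python
from pathlib import Path


def resolve_data_file(name, base_dir=None):
    """Return an absolute path to `name`, or raise if the file is missing."""
    # Fall back to the current working directory when no base is given
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    path = (base / name).resolve()
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found (current directory: {Path.cwd()})"
        )
    return path
```

In a script, passing Path(__file__).parent as base_dir anchors the lookup to the script's own folder, and the returned path can be handed to pd.read_json.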

Upvotes: 6

matt.robinson17

Reputation: 39

I don't think this is the problem, since it should be the default (I think), but have you tried adding 'r' to open the file in read mode?

import json
import pandas as pd

with open('gdb.json', 'r') as datafile:
    data = json.load(datafile)
retail = pd.DataFrame(data)

Upvotes: 2
