JoJo

Reputation: 1447

Error loading a JSON file in Python

The JSON file contains the following:

{"votes": {"funny": 0, "useful": 5, "cool": 2}, "user_id": "rLtl8ZkDX5vH5nAx9C3q5Q", "review_id": "fWKvX83p0-ka4JS3dc6E5A", "stars": 5, "date": "2011-01-26", "text": "My wife took me here on my birthday for breakfast and it was excellent.  It looked like the place fills up pretty quickly so the earlier you get here the better.\n\nDo yourself a favor and get their Bloody Mary. It came with 2 pieces of their griddled bread with was amazing and it absolutely made the meal complete.  It was the best \"toast\" I've ever had.\n\nAnyway, I can't wait to go back!", "type": "review", "business_id": "9yKzy9PApeiPPOUJEtnvkg"}
{"votes": {"funny": 0, "useful": 0, "cool": 0}, "user_id": "0a2KyEL0d3Yb1V6aivbIuQ", "review_id": "IjZ33sJrzXqU-0X6U8NwyA", "stars": 5, "date": "2011-07-27", "text": "I have no idea why some people give bad reviews about this place. It goes to show you, you can please everyone. They are probably griping about something that their own fault... but they said we'll be seated when the girl comes back from seating someone else. So, everything was great and not like these bad reviewers. That goes to show you that  you have to try these things yourself because all these bad reviewers have some serious issues.", "type": "review", "business_id": "ZRJwVLyzEJq1VAihDhYiow"}

My code is:

import json
from pprint import pprint
review = open('/User/Desktop/python/test.json')
data = json.load(review)
pprint(data["votes"])

The error is:

Traceback (most recent call last):
  File "/Users/hadoop/Documents/workspace/dataming-course/src/Yelp/main.py", line 8, in <module>
    data = json.load(review)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 278, in load
    **kw)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 363, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 2 column 1 - line 3 column 1 (char 623 - 1294)

Upvotes: 1

Views: 454

Answers (3)

summea

Reputation: 7583

For what it's worth, you could try putting your JSON into an array, like this:

[ { "business_id" : "9yKzy9PApeiPPOUJEtnvkg",
    "date" : "2011-01-26",
    "review_id" : "fWKvX83p0-ka4JS3dc6E5A",
    "stars" : "5",
    "text" : "My wife took me here on my birthday for breakfast and it was excellent.  It looked like the place fills up pretty quickly so the earlier you get here the better.\n\nDo yourself a favor and get their Bloody Mary. It came with 2 pieces of their griddled bread with was amazing and it absolutely made the meal complete.  It was the best \"toast\" I've ever had.\n\nAnyway, I can't wait to go back!",
    "type" : "review",
    "user_id" : "rLtl8ZkDX5vH5nAx9C3q5Q",
    "votes" : { "cool" : "2",
        "funny" : "0",
        "useful" : "5"
      }
  },
  { "business_id" : "ZRJwVLyzEJq1VAihDhYiow",
    "date" : "2011-07-27",
    "review_id" : "IjZ33sJrzXqU-0X6U8NwyA",
    "stars" : "5",
    "text" : "I have no idea why some people give bad reviews about this place. It goes to show you, you can please everyone. They are probably griping about something that their own fault... but they said we'll be seated when the girl comes back from seating someone else. So, everything was great and not like these bad reviewers. That goes to show you that  you have to try these things yourself because all these bad reviewers have some serious issues.",
    "type" : "review",
    "user_id" : "0a2KyEL0d3Yb1V6aivbIuQ",
    "votes" : { "cool" : "0",
        "funny" : "0",
        "useful" : "0"
      }
  }
]

(Note the comma that separates the two top-level objects in the JSON array.)

Upvotes: 1

JBernardo

Reputation: 33397

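If you can't change the input file, you can use JSONDecoder.raw_decode to parse it one document at a time. It returns the decoded value together with the index in the string where that value ended: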

>>> dec = json.JSONDecoder()
>>> dec.raw_decode('["a",1]{"foo": 2}')
(['a', 1], 7)
>>> dec.raw_decode('["a",1]{"foo": 2}', 7)
({'foo': 2}, 17)

You will need to read the whole file into a string first.
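As a rough sketch of how that could look for a string holding several documents: the helper name iter_json_documents and the sample string are made up for illustration, and the whitespace skipping is there because raw_decode does not step over leading whitespace (such as the newline between two documents) by itself.

```python
import json


def iter_json_documents(text):
    """Yield each top-level JSON value found in text, in order.

    Hypothetical helper, not part of the json module.
    """
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so step over
        # newlines and spaces between documents ourselves.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode returns (value, index just past that value).
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Made-up stand-in for the file contents: two documents, one per line.
sample = '{"votes": {"useful": 5}}\n{"votes": {"useful": 0}}\n'

for doc in iter_json_documents(sample):
    print(doc["votes"])
```

For a real file, the string would come from something like open(path).read(), where path is whatever location your file actually has.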

Upvotes: 2

John Zwinck

Reputation: 249123

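You have two JSON documents in a single file, one per line. json.load expects the file to contain exactly one top-level value, which is why it reports "Extra data" starting at line 2. Consider wrapping the documents in an array so that the top level of the file is a single element.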
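If rewriting the file is not practical, one possible workaround is to build that array in memory instead. This is only a sketch under the assumption that every document occupies exactly one line, as in the question's sample. The helper name load_json_lines and the shortened sample lines are invented for illustration.

```python
import json


def load_json_lines(lines):
    """Parse one JSON document per non-blank line and return them as a list.

    Hypothetical helper; assumes no document spans more than one line.
    """
    return [json.loads(line) for line in lines if line.strip()]


# Made-up, shortened stand-in for the lines of the file.
sample_lines = [
    '{"votes": {"funny": 0, "useful": 5, "cool": 2}, "stars": 5}\n',
    '{"votes": {"funny": 0, "useful": 0, "cool": 0}, "stars": 5}\n',
]

reviews = load_json_lines(sample_lines)
for review in reviews:
    print(review["votes"])
```

An open file object can be passed in place of sample_lines, since iterating over a file yields its lines.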

Upvotes: 5
