Reputation: 3385
I am trying to parse this json file: http://pastebin.com/VcVR0ue0
While using these modules
from pprint import pprint
import codecs
import json
file = 'Desktop10000_760_CurtSacks.json'
I've tried these methods
a)
data = json.load(open(file))
b)
data = json.load(codecs.open(file, encoding='utf_8_sig'))
In both cases the output has a u prefix in front of every key and every string value:
{u'document_tone': {u'tone_categories': [{u'category_id': u'emotion_tone',
u'category_name': u'Emotion Tone',
u'tones': [{u'score': 0.111838,
u'tone_id': u'anger',
u'tone_name': u'Anger'},
{u'score': 0.159831,
u'tone_id': u'disgust',
u'tone_name': u'Disgust'},
{u'score': 0.17082,
u'tone_id': u'fear',
u'tone_name': u'Fear'},
{u'score': 0.507748,
u'tone_id': u'joy',
u'tone_name': u'Joy'},
{u'score': 0.520722,
u'tone_id': u'sadness',
u'tone_name': u'Sadness'}]},
How do I read the file correctly?
Upvotes: 0
Views: 51
Reputation: 81
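The u prefix marks a Python 2 unicode string; this is normal. The json library returns unicode strings by default, so it looks like your data is being parsed properly.
If for whatever reason you don't want unicode strings, you can load the file with PyYAML instead. JSON is very nearly a subset of YAML, and on Python 2 PyYAML returns plain str objects for strings that contain only ASCII characters: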
import yaml
data = yaml.safe_load(open(file))
print( data )
So you'd get
{'key':'item'}
Instead of
{u'key': u'item'}
Although I don't see a reason not to use unicode, as for most purposes it won't affect much. (see Python str vs unicode types)
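To illustrate that the u prefix does not get in the way, here is a minimal sketch that reads a file and pulls values out of the parsed result. The sample content and the temporary file name are assumptions made up for the example (modeled on the output in the question), not the real file:

```python
import io
import json
import os
import tempfile

# Hypothetical sample, shaped like the output shown in the question.
sample = (u'{"document_tone": {"tone_categories": [{'
          u'"category_id": "emotion_tone", '
          u'"tones": [{"score": 0.507748, "tone_id": "joy", "tone_name": "Joy"}]}]}}')

# Write the sample with a UTF-8 BOM so that the utf-8-sig read below has one to strip.
path = os.path.join(tempfile.mkdtemp(), 'sample.json')
with io.open(path, 'w', encoding='utf-8-sig') as f:
    f.write(sample)

# utf-8-sig drops a leading BOM if present and otherwise behaves like utf-8.
with io.open(path, encoding='utf-8-sig') as f:
    data = json.load(f)

# Keys are looked up with ordinary string literals; printed values carry no u prefix.
tones = data['document_tone']['tone_categories'][0]['tones']
for tone in tones:
    print('%s: %s' % (tone['tone_name'], tone['score']))
```

The with blocks also close the file handles, which the bare open(file) calls in the question leave open.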
Upvotes: 0
Reputation: 60143
It looks like everything's being parsed properly.
Python's syntax for a unicode string is:
u'Here is the string.'
So the Python equivalent of this JSON:
{"foo": "bar"}
is this:
{u'foo': u'bar'}
If you just print out the Python representation of the data, you'll see the Python syntax.
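A small sketch of the difference, using the same foo/bar example: the prefix only shows up in the Python representation of the container, not when you print a value or serialize the data back to JSON.

```python
import json

data = json.loads('{"foo": "bar"}')

# Python representation of the dict; on Python 2 this is where the u prefix appears.
print(repr(data))

# Printing the string itself shows only its contents.
print(data['foo'])

# Serializing back to JSON gives JSON syntax again, with no prefix.
print(json.dumps(data))
```

If the goal is just readable output, json.dumps(data, indent=2) may be a better fit than pprint, since it prints JSON rather than Python syntax.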
Upvotes: 1