Reputation: 26442
I'm trying to parse a JSON string that contains an escape character (of some sort, I guess):
{
"publisher": "\"O'Reilly Media, Inc.\""
}
The parser works fine if I remove the \" characters from the string. The exceptions raised by the different parsers are:
json
File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python2.7/json/decoder.py", line 382, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 17 column 20 (char 392)
ujson
ValueError: Unexpected character in found when decoding object value
How do I make the parser handle these escaped characters?
update:
PS: json is imported as ujson in this example.
This is what my IDE shows:
The comma was added accidentally; the actual JSON has no trailing comma and is valid.
the string definition.
Upvotes: 2
Views: 6653
Reputation: 4568
Your JSON is invalid. If you have questions about your JSON objects, you can always validate them with JSONlint. In your case you have an object
{
"publisher": "\"O'Reilly Media, Inc.\"",
}
and you have an extra comma indicating that something else should be coming. So JSONlint yields
Parse error on line 2:
...edia, Inc.\"", }
---------------------^
Expecting 'STRING'
which would begin to help you find where the error was.
Removing the comma for
{
"publisher": "\"O'Reilly Media, Inc.\""
}
yields
Valid JSON
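The same check can be run locally with Python's json module instead of an online validator. This is only a minimal sketch: check_json and the two variable names are hypothetical helpers made up for illustration, and the exact error message differs between Python versions.

```python
import json


def check_json(text):
    """Return (True, None) if text parses as JSON, else (False, error message)."""
    try:
        json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError subclasses ValueError on Python 3
        return False, str(exc)
    return True, None


# Raw triple-quoted literals so Python leaves the \" sequences untouched.
with_trailing_comma = r'''{"publisher": "\"O'Reilly Media, Inc.\"",}'''
without_trailing_comma = r'''{"publisher": "\"O'Reilly Media, Inc.\""}'''

print(check_json(with_trailing_comma))     # rejected, with the parser's message
print(check_json(without_trailing_comma))  # accepted
```

The message returned for the rejected document includes a line and column, which plays the same role as the JSONlint pointer above.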
Update: I'm keeping the stuff in about JSONlint as it may be helpful to others in the future. As for your well formed JSON object, I have
import json
d = {
"publisher": "\"O'Reilly Media, Inc.\""
}
print "Here is your string parsed."
print(json.dumps(d))
yielding
Here is your string parsed.
{"publisher": "\"O'Reilly Media, Inc.\""}
Process finished with exit code 0
Upvotes: 2
Reputation: 1122492
You almost certainly did not define properly escaped backslashes. If you define the string properly the JSON parses just fine:
>>> import json
>>> json_str = r'''
... {
... "publisher": "\"O'Reilly Media, Inc.\""
... }
... ''' # raw string to prevent the \" from being interpreted by Python
>>> json.loads(json_str)
{u'publisher': u'"O\'Reilly Media, Inc."'}
Note that I used a raw string literal to define the string in Python; if I did not, the \" would be interpreted by Python and a regular " would be inserted. You'd have to double the backslash otherwise:
>>> print '\"'
"
>>> print '\\"'
\"
>>> print r'\"'
\"
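To see the difference without relying on how print renders the strings, the three literals can be compared directly. A small sketch (the variable names are made up for illustration):

```python
plain = '\"'     # Python consumes the backslash: just a double quote
doubled = '\\"'  # escaped backslash: a backslash followed by a double quote
raw = r'\"'      # raw literal: the backslash is kept as written

assert plain == '"'
assert doubled == raw
print(len(plain), len(doubled), len(raw))  # 1 2 2
```

Only the doubled and raw forms still contain the backslash that the JSON parser needs to see.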
Re-encoding the parsed Python structure back to JSON shows the backslashes reappearing, with the repr() output for the string using the same double backslash:
>>> json.dumps(json.loads(json_str))
'{"publisher": "\\"O\'Reilly Media, Inc.\\""}'
>>> print json.dumps(json.loads(json_str))
{"publisher": "\"O'Reilly Media, Inc.\""}
If you did not escape the \ escape character, you'll end up with unescaped quotes:
>>> json_str_improper = '''
... {
... "publisher": "\"O'Reilly Media, Inc.\""
... }
... '''
>>> print json_str_improper
{
"publisher": ""O'Reilly Media, Inc.""
}
>>> json.loads(json_str_improper)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/mj/Development/Library/buildout.python/parts/opt/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/Users/mj/Development/Library/buildout.python/parts/opt/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/Users/mj/Development/Library/buildout.python/parts/opt/lib/python2.7/json/decoder.py", line 382, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 3 column 20 (char 22)
Note that the \" sequences are now printed as "; the backslash is gone!
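Putting this together, here is a hedged sketch of three ways to end up with the correct JSON text in a Python string: a raw literal, doubled backslashes, or letting json.dumps produce the escapes from Python data. The variable names are illustrative only, and the sketch assumes the JSON is embedded in source code; text read from a file or a network response needs no extra escaping at the Python level.

```python
import json

# 1. Raw literal: backslashes reach the JSON parser unchanged.
raw_literal = r'''{"publisher": "\"O'Reilly Media, Inc.\""}'''

# 2. Regular literal: every backslash meant for JSON must be doubled.
doubled_literal = '{"publisher": "\\"O\'Reilly Media, Inc.\\""}'

# 3. Build the JSON text from Python data and let json.dumps add the escapes.
generated = json.dumps({"publisher": '"O\'Reilly Media, Inc."'})

# All three hold exactly the same characters.
assert raw_literal == doubled_literal == generated

parsed = json.loads(raw_literal)
print(parsed["publisher"])  # "O'Reilly Media, Inc."
```

The third option avoids hand-written escapes entirely, which is usually the least error-prone choice when the JSON is being constructed rather than copied.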
Upvotes: 9