James Dalton

Reputation: 23

Python JSON Parsing fails because of \" in the text

How do I make Python parse JSON correctly when a string value contains an escaped quote (\")?

import json
from pprint import pprint

json_data = """{
  "*": {
    "picker": {
      "old": 49900,
      "description": "Meaning \"sunshine\" and \r\n- cm."
    }
  }
}"""
clean_json = json_data.replace("\r","").replace("\n","")
print(clean_json)
data_dict = json.loads(clean_json)
pprint(data_dict)

If I call .replace("\"","") instead, it removes every " in the JSON, so that does not work either.

Please help!

Upvotes: 0

Views: 65

Answers (2)

Masklinn

Reputation: 42342

Since you're embedding JSON in a Python string literal, Python's escaping rules are applied first, when the source code (and with it the string literal) is parsed.

This means \" is first interpreted at the Python level, yielding a bare ". The JSON parser then sees that unescaped " as the end of the string value, and parsing fails.

You need to either:

  • escape each \ so that it ends up as an actual \ character in the resulting string (just double it)
  • or use a raw string (just prefix the triple-quoted string with r). This disables most escape processing. Raw strings are mostly used for regular-expression literals because those use \ a lot, but they are equally suitable for JSON literals and other embedded languages.
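
The difference between the two literal forms can be seen in a minimal sketch. The payload below is a shortened, made-up stand-in for the JSON in the question, and the variable names are only illustrative:

```python
import json

# Identical source text; only the r prefix differs.
plain = '{"d": "say \"hi\" now"}'
raw = r'{"d": "say \"hi\" now"}'

print(plain)  # Python has already consumed the backslashes
print(raw)    # the backslashes are still there for the JSON parser

# The raw version is valid JSON.
parsed = json.loads(raw)

# The plain version is not: the inner quotes end the JSON string early.
try:
    json.loads(plain)
    plain_failed = False
except json.JSONDecodeError:
    plain_failed = True
```

Printing both strings before calling json.loads is a quick way to check what the JSON parser will actually receive.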

Your version:

>>> from json import loads
>>> loads("""{
...   "*": {
...     "picker": {
...       "old": 49900,
...       "description": "Meaning \"sunshine\" and \r\n- cm."
...     }
...   }
... }""")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "lib/python3.8/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 5 column 32 (char 78)

Escaping the escapes:

>>> loads("""{
...   "*": {
...     "picker": {
...       "description": "Meaning \\"sunshine\\" and \\r\\n- cm."
...     }
...   }
... }""")
{'*': {'picker': {'description': 'Meaning "sunshine" and \r\n- cm.'}}}

Raw string:

>>> loads(r"""{
...   "*": {
...     "picker": {
...       "description": "Meaning \"sunshine\" and \r\n- cm."
...     }
...   }
... }""")
{'*': {'picker': {'description': 'Meaning "sunshine" and \r\n- cm.'}}}
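
If it is unclear what the correctly escaped text should look like, one option is to let json.dumps produce it from the intended Python value. This is only a sketch that assumes the description is meant to contain real double quotes and a CR LF pair, as in the outputs above:

```python
import json

# The value we want to end up with after parsing.
data = {
    "*": {
        "picker": {
            "old": 49900,
            "description": 'Meaning "sunshine" and \r\n- cm.',
        }
    }
}

# json.dumps writes the quotes and control characters as JSON escapes.
text = json.dumps(data)
print(text)

# Parsing the generated text gives back the original value.
round_trip = json.loads(text)
```

The printed text contains \" and \r\n as backslash sequences, which is exactly what the JSON parser needs to see and what a non-raw Python literal would have swallowed.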

Upvotes: 3

litreily

Reputation: 1

I think you need to add the r prefix in front of the string literals:

import json

json_data = r"""
{
    "*": {
        "picker": {
            "old": 49900,
            "description": "Meaning \"sunshine\" and \r\n- cm."
        }
    }
}
"""

clean_json = json_data.replace(r"\r","").replace(r"\n","").replace(r'\"',"")
print(clean_json)
data_dict = json.loads(clean_json)
print(data_dict)

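The output is as follows. Note that the replace calls strip the escaped quotes and the line break from the description rather than preserving them: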

{
    "*": {
        "picker": {
            "old": 49900,
            "description": "Meaning sunshine and - cm."
        }
    }
}

{'*': {'picker': {'old': 49900, 'description': 'Meaning sunshine and - cm.'}}}
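
If the quotes and the line break should be kept rather than removed, the replace calls can simply be dropped once the literal is raw. A minimal sketch, reusing the same json_data as above:

```python
import json

json_data = r"""
{
    "*": {
        "picker": {
            "old": 49900,
            "description": "Meaning \"sunshine\" and \r\n- cm."
        }
    }
}
"""

# No replace() calls: the JSON parser decodes \" and \r\n itself.
data_dict = json.loads(json_data)
description = data_dict["*"]["picker"]["description"]
print(repr(description))
```

Here the description keeps its inner double quotes and the carriage return and line feed as real characters.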

Upvotes: 0
