Reputation: 1
I'm trying to transform a string that contains a dict to a dict object using json. But in the data contains a " example
string = '{"key1":"my"value","key2":"my"value2"}'
js = json.loads(s,strict=False)
it outputs json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 13 (char 12) as " is a delimiter and there is too much of it
What is the best way to achieve my goal ?
The solution I have found is to perform several .replace on the string to replace legit " by a pattern until only illgal " remains then replace back the pattern by the legit " After that I can use json.loads and then replace the remaining pattern by the illegal " But there must be another way
ex :
string = '{"key1":"my"value","key2":"my"value2"}'
string = string.replace('{"','__pattern_1')
string = string.replace('}"','__pattern_2')
...
...
string = string.replace('"','__pattern_42')
string = string.replace('__pattern_1','{"')
string = string.replace('__pattern_2','}"')
...
...
js = json.loads(s,strict=False)
Upvotes: 0
Views: 776
Reputation: 56
Problem is, that the string contains invalid json format.
String '{"key1": "my"value", "key2": "my"value2"}'
: value of key1
ends with "my"
and additional characters value"
are against the format.
You can use character escaping, valid json would look like:
{"key1": "my\"value", "key2": "my\"value2"}
.
Since you are defining it as value you would then need to escape the escape characters:
string = '{"key1": "my\\"value", "key2": "my\\"value2"}'
There is a lot of educative material online on character escaping. I recommend to check it out if something is not clear
Edit: If you insist on fixing the string in code (which I don't recommend, see comment) you can do something like this:
import re
import json
string = '{"key1":"my"value","key2":"my"value2"}'
# finds contents of keys and values, assuming that the key the real key/value ending double quotes
# followed by one of following characters: ,}:]
m = re.finditer(r'"([^:]+?)"(?:[,}:\]])', string)
new_string = string
for i in reversed(list(m)):
ss, se = i.span(1) # first group holds the content
# escape double backslashes in the content and add all back together
# note that this is not effective. Bigger amounts of replacements would require other approach of concatanation
new_string = new_string[:ss] + new_string[ss:se].replace('"', '\\"') + new_string[se:]
json.loads(new_string)
This assumes that the real ending double quotes are followed by one of ,:}]
. In other cases this won't work
Upvotes: 0
Reputation: 11
This should work. What I am doing here is to simply replace all the expected double quotes with something else and then remove the unwanted double quotes. and then convert it back.
import re
import json
def fix_json_string(st):
st = re.sub(r'","',"!!",st)
st = re.sub(r'":"',"--",st)
st = re.sub(r'{"',"{{",st)
st = re.sub(r'"}',"}}",st)
st = st.replace('"','')
st = re.sub(r'}}','"}',st)
st = re.sub(r'{{','{"',st)
st = re.sub(r'--','":"',st)
st = re.sub(r'!!','","',st)
return st
broken_string = '{"key1":"my"value","key2":"my"value2"}'
fixed_string = fix_json_string(broken_string)
print(fixed_string)
js = json.dumps(eval(fixed_string))
print(js)
Output -
{"key1":"myvalue","key2":"myvalue2"} # str
{"key1": "myvalue", "key2": "myvalue2"} # converted to json
Upvotes: 1
Reputation: 116
The variable string
is not a valid JSON string.
The correct string should be:
string = '{"key1":"my\\"value","key2":"my\\"value2"}'
Upvotes: 0