Reputation: 13
When I convert a properties file to JSON, an extra backslash is added before each \u escape sequence. How can I avoid this? See the code below.
Input File (sample.properties)
property.key.CHOOSE=\u9078\u629e
Code
import json

def convertPropertiesToJson(fileName, outputFileName, sep='=', comment_char='#'):
    props = {}
    with open(fileName, "r") as f:
        for line in f:
            l = line.strip()
            if l and not l.startswith(comment_char):
                innerProps = {}
                keyValueList = l.split(sep)
                key = keyValueList[0].strip()
                keyList = key.split('.')
                value = sep.join(keyValueList[1:]).strip()
                if keyList[1] not in props:
                    props[keyList[1]] = {}
                innerProps[keyList[2]] = value
                props[keyList[1]].update(innerProps)
    with open(outputFileName, 'w') as outfile:
        json.dump(props, outfile)

convertPropertiesToJson("sample.properties", "sample.json")
Output: (sample.json)
{"key": {"CHOOSE": "\\u9078\\u629e"}}
Expected Result:
{"key": {"CHOOSE": "\u9078\u629e"}}
Upvotes: 1
Views: 982
Reputation: 508
The problem seems to be that the file stores the Unicode characters as escape sequences, which are read in as plain text (a backslash, a "u", and four hex digits). You should decode them at some point.
Changing
l = line.strip()
to (for Python 2.x)
l = line.strip().decode('unicode-escape')
or, for Python 3.x, to
l = line.strip().encode('ascii').decode('unicode-escape')
gives the desired output:
{"key": {"CHOOSE": "\u9078\u629e"}}
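As a quick check of the Python 3 variant, here is a minimal sketch that decodes the sample value and serializes it both ways. The helper name decode_escapes is only illustrative, and the approach assumes the text is pure ASCII, since .encode('ascii') raises UnicodeEncodeError otherwise.

```python
import json

def decode_escapes(text):
    # Hypothetical helper: turn literal "\uXXXX" sequences into real characters.
    # Assumes pure-ASCII input; encode('ascii') raises UnicodeEncodeError otherwise.
    return text.encode('ascii').decode('unicode-escape')

raw = r"\u9078\u629e"           # 12 characters, as read from the properties file
decoded = decode_escapes(raw)   # 2 characters

print(len(raw), len(decoded))
# The undecoded value keeps its literal backslashes, which JSON must escape.
print(json.dumps({"key": {"CHOOSE": raw}}))
# The decoded value is written with single-backslash \u escapes (ensure_ascii default).
print(json.dumps({"key": {"CHOOSE": decoded}}))
```

Decoding only the value rather than the whole line might be safer if keys can contain backslashes, but that depends on the input files.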
Upvotes: 0
Reputation: 17342
The problem is that the input is read as-is, and \u is copied literally as two characters. The easiest fix is probably this:
with open(fileName, "r", encoding='unicode-escape') as f:
This will decode the escaped Unicode characters as the file is read.
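A small self-contained sketch of this approach, assuming Python 3. The temporary file and the single-line parsing are stand-ins for the real sample.properties handling, not part of the original code.

```python
import json
import os
import tempfile

# Write a throwaway properties file containing literal backslash-u escapes.
with tempfile.NamedTemporaryFile("w", suffix=".properties",
                                 delete=False, encoding="ascii") as tmp:
    tmp.write("property.key.CHOOSE=\\u9078\\u629e\n")
    path = tmp.name

# Opening with the unicode-escape codec decodes the escapes while reading.
with open(path, "r", encoding="unicode-escape") as f:
    line = f.readline().strip()
os.remove(path)

key, _, value = line.partition("=")
result = json.dumps({key: value})
print(len(value))   # the value is now two real characters
print(result)
```

Note that this codec also interprets other backslash sequences such as \n or \t and treats non-ASCII bytes as Latin-1, so it is probably best suited to files that are plain ASCII apart from the \u escapes.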
Upvotes: 2
Reputation: 11560
I don't know the solution to your problem, but I found out where the problem occurs.
with open('sample.properties', encoding='utf-8') as f:
    for line in f:
        print(line)
        print(repr(line))
        d = {}
        d['line'] = line
        print(d)
out:
property.key.CHOOSE=\u9078\u629e
'property.key.CHOOSE=\\u9078\\u629e'
{'line': 'property.key.CHOOSE=\\u9078\\u629e'}
I don't know why adding the string to a dictionary shows the repr of the string.
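For what it's worth, the doubled backslashes in the dictionary output look like a display effect only: printing a dict shows the repr() of its values, while the string itself still holds one backslash per escape. The sketch below illustrates this with the sample line hard-coded as a raw string (an assumption, in place of reading the file), and shows that the doubling in the JSON file comes from json escaping each literal backslash.

```python
import json

# The sample line, hard-coded as a raw string instead of read from the file.
line = r"property.key.CHOOSE=\u9078\u629e"

print(line)             # str(): one backslash per escape
print(repr(line))       # repr(): each backslash is shown doubled
print({"line": line})   # printing a dict uses repr() for its values

# The string itself contains a single backslash before each "u".
print(line.count("\\"))

# json.dumps has to escape every literal backslash, hence "\\u" in the output.
print(json.dumps(line))
```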
Upvotes: 0