Reputation: 41
I have a JSON file which looks like:
[
{
"story_id": xx,
"line_number": 109,
"sentence": "fhsabdajbndkjlabhfegbdajbdhj",
"ner": "{'gfjghj': 'PERSON', 'hjbhjb': 'DATE'}",
"PROPN": "['vhjb', 'ghjhb']",
"Best": 1
}
]
I want to remove the leading and trailing double quotes (") from the values of the ner and PROPN keys, so that they are stored as an object and a list rather than as strings.
The output should be a JSON file, and the data should look like:
[
{
"story_id": xx,
"line_number": 109,
"sentence": "fhsabdajbndkjlabhfegbdajbdhj",
"ner": {'gfjghj': 'PERSON', 'hjbhjb': 'DATE'},
"PROPN": ['vhjb', 'ghjhb'],
"Best": 1
}
]
I tried this:
import json

with open('path/to/file.json', 'r', encoding='utf-8') as f:
    data = json.load(f)

for item in data:
    item['ner'] = item['ner'].replace('"{', '{').replace('}"', '}').replace('"[', '[').replace(']"', ']')

with open('path/to/output_file.json', 'w') as f:
    json.dump(data, f)
When I run this, I get a UnicodeDecodeError.
Can anyone help with this?
Thanks in advance.
Upvotes: 0
Views: 106
Reputation: 82785
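Use the ast module. ast.literal_eval parses a string containing a Python literal (such as a dict or a list) into the corresponding object.
Ex: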
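import ast
import json

with open('path/to/file.json', 'r', encoding='utf-8') as f:
    data = json.load(f)

for item in data:
    # Turn the stringified dict/list back into real Python objects
    item['ner'] = ast.literal_eval(item['ner'])
    item['PROPN'] = ast.literal_eval(item['PROPN'])

with open('path/to/output_file.json', 'w', encoding='utf-8') as f:
    json.dump(data, f)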
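To check the idea without touching any files, here is a minimal sketch that runs on an in-memory copy of the sample record. The helper name unstringify and the story_id value of 1 are made up for illustration (the question elides the real value as xx); ast.literal_eval, json.loads and json.dumps are standard library calls.

```python
import ast
import json

# Sample record shaped like the one in the question.
# story_id = 1 is a placeholder; the original value is not given.
raw = '''[
  {
    "story_id": 1,
    "line_number": 109,
    "sentence": "fhsabdajbndkjlabhfegbdajbdhj",
    "ner": "{'gfjghj': 'PERSON', 'hjbhjb': 'DATE'}",
    "PROPN": "['vhjb', 'ghjhb']",
    "Best": 1
  }
]'''


def unstringify(records, keys=("ner", "PROPN")):
    """Hypothetical helper: parse string-encoded Python literals in place."""
    for record in records:
        for key in keys:
            value = record.get(key)
            # Only parse values that are still strings, so the helper
            # can be run again on already-converted data.
            if isinstance(value, str):
                record[key] = ast.literal_eval(value)
    return records


data = unstringify(json.loads(raw))
result = json.dumps(data)
print(result)
```

Note that the serialized output uses double quotes inside ner and PROPN, since JSON does not allow single-quoted strings. As for the UnicodeDecodeError: it is raised while decoding the input, so it comes from reading the file rather than from the replace calls. One possible cause (an assumption, since the file itself is not shown) is that the file is not actually UTF-8 encoded, in which case passing its real encoding to open() may help.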
Upvotes: 1