Reputation: 238
I receive a JSON object like this:
{
    "Question Communicating": "Natural language",
    "interpretation_type": "recognition",
    "output1": "test",
    "Question Learning": "Reinforcement",
    "output2": "test2",
    "output3": "something"
}
My question is: is it possible to rename every 'outputX' key to 'output'?
I don't know how many 'outputX' keys will be in the JSON, but I need all of them renamed to 'output'.
So it will end up like this:
{
    "Question Communicating": "Natural language",
    "interpretation_type": "recognition",
    "output": "test",
    "Question Learning": "Reinforcement",
    "output": "test2",
    "output": "something"
}
Upvotes: 2
Views: 596
Reputation: 530930
One possibility is to use a data structure that allows duplicate keys, such as webob.multidict.MultiDict.
import json

import webob.multidict


class MultiDictEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, webob.multidict.MultiDict):
            return o
        else:
            return super().default(o)

    def encode(self, o):
        if isinstance(o, webob.multidict.MultiDict):
            # Just a proof of concept. No attempt is made
            # to properly encode keys or values.
            return ('{'
                    + ', '.join(f'"{k}": "{v}"' for k, v in o.items())
                    + '}')
        else:
            return super().encode(o)


with open("tmp1.json") as f:
    input_data = json.load(f)

# MultiDict keeps every (key, value) pair, even when keys repeat.
output_data = webob.multidict.MultiDict()
for k, v in input_data.items():
    if k.startswith("output"):
        k = 'output'
    output_data.add(k, v)

with open("tmp2.json", 'w') as f:
    print(json.dumps(output_data, cls=MultiDictEncoder), file=f)
For some reason in testing this, using json.dump produced an error involving circular references. I don't know if this is a problem with how I defined MultiDictEncoder.default, but the resulting tmp2.json does have duplicate output keys.
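The circular-reference error is likely because json.dump goes through iterencode rather than encode, so default ends up being called and hands back the same object. If duplicate keys really are required, one way to sidestep a custom encoder altogether is to build the object text by hand and let json.dumps escape each key and value. The following is only a sketch using the standard library; the helper names rename_outputs and dumps_pairs are made up for illustration, and only the top-level keys are renamed.

```python
import json


def rename_outputs(pairs, prefix="output"):
    """Yield (key, value) pairs, renaming any key that starts with prefix."""
    for key, value in pairs:
        yield (prefix if key.startswith(prefix) else key, value)


def dumps_pairs(pairs):
    """Serialize (key, value) pairs as one JSON object, keeping duplicate keys."""
    # json.dumps handles quoting and escaping of each key and value.
    members = (f"{json.dumps(str(k))}: {json.dumps(v)}" for k, v in pairs)
    return "{" + ", ".join(members) + "}"


text = '{"interpretation_type": "recognition", "output1": "test", "output2": "test2"}'
data = json.loads(text)
result = dumps_pairs(rename_outputs(data.items()))
print(result)
```

Most parsers will keep only one of the duplicated values when reading this text back, so whoever consumes the file has to read it pair by pair as well.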
Upvotes: 0
Reputation: 26315
Trying to use duplicate keys in a JSON object is not recommended. You can see the problems that arise when you serialize and deserialize duplicate keys, or try to force them into a dictionary. The duplicate keys are not retained.
>>> from json import dumps, loads
>>> json = '{"a": "x", "a": "y"}'
>>> loads(json)
{'a': 'y'}
>>> json = {'a': 'x', 'a': 'y'}
>>> dumps(json)
'{"a": "y"}'
>>> json = {'a': 'x', 'a': 'y'}
>>> json
{'a': 'y'}
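The duplicates are lost at the point where the parsed pairs are stored in a dict, not earlier. As a small sketch, the object_pairs_hook parameter of json.loads can be used to inspect every pair, or to reject input that repeats a key. The function name reject_duplicates is hypothetical.

```python
import json


def reject_duplicates(pairs):
    """Build a dict from parsed pairs, raising if a key appears twice."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


text = '{"a": "x", "a": "y"}'

# Every pair as parsed, in document order.
all_pairs = json.loads(text, object_pairs_hook=list)
print(all_pairs)

error = None
try:
    json.loads(text, object_pairs_hook=reject_duplicates)
except ValueError as exc:
    error = str(exc)
    print(error)
```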
Instead you could try grouping all keys that start with "output" into a list ["test", "test2", "something"].
from json import dumps

d = {
    "Question Communicating": "Natural language",
    "interpretation_type": "recognition",
    "output1": "test",
    "Question Learning": "Reinforcement",
    "output2": "test2",
    "output3": "something"
}

result = {}
for k, v in d.items():
    if k.startswith("output"):
        # Collect every outputX value under a single "output" list.
        result.setdefault("output", []).append(v)
    else:
        result[k] = v

print(dumps(result, indent=4))
Output JSON:
{
    "Question Communicating": "Natural language",
    "interpretation_type": "recognition",
    "output": [
        "test",
        "test2",
        "something"
    ],
    "Question Learning": "Reinforcement"
}
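Note that startswith("output") would also catch unrelated keys that merely begin with that word. If that matters, a stricter variant could match only "output" followed by digits. This is a sketch; the function name group_outputs and the sample key "output_format" are invented for illustration.

```python
import re
from json import dumps

# Matches "output1", "output42", ... but not "output" or "output_format".
OUTPUT_KEY = re.compile(r"output\d+")


def group_outputs(d):
    """Collect values of numbered output keys into one "output" list."""
    result = {}
    for k, v in d.items():
        if OUTPUT_KEY.fullmatch(k):
            result.setdefault("output", []).append(v)
        else:
            result[k] = v
    return result


d = {"output1": "test", "output_format": "json", "output2": "test2"}
grouped = group_outputs(d)
print(dumps(grouped))
```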
Upvotes: 2