Reputation: 3587
I'm trying to parse a XML file using xml ElementTree and json
from xml.etree import ElementTree as et
import json
def parse_file(file_name):
tree = et.ElementTree()
npcs = {}
for npc in tree.parse(file_name):
quests = []
for quest in npc:
quest_name = quest.attrib['name']
stages = []
for i, stage in enumerate(quest):
next_stage, choice, npc_condition = None, None, None
for key, val in stage.attrib.items():
val = json.loads(val)
if key == 'choices':
choice = val
elif key == 'next_stage':
next_stage = val
elif key == 'ncp_condition':
npc_condition = {stage.attrib['npc_name']: val}
stages.append([i, next_stage, choice, npc_condition])
quests.append( {quest_name:stages})
npcs[npc.attrib['name']] = quests
return npcs
The XML file:
<?xml version="1.0" encoding="utf-8"?>
<npcs>
<npc name="NPC NAME">
<quest0 name="Quest Name here">
<stage0 choices='{"Option1":1, "Option1":2}'>
<text>text1</text>
</stage0>
<stage1 next_stage="[3,4]">
<text>text2</text>
</stage1>
<stage3 npc_name="other_npc_name" ncp_condition='{"some_condition":false}' next_stage="[3, 4]">
<text>text3</text>
</stage3>
</quest0>
</npc>
</npcs>
But I'm having trouble with this bit:
<stage3 npc_name="other_npc_name" ncp_condition='{"some_condition":false}' next_stage="[3, 4]">
Traceback:
Traceback (most recent call last):
File "C:/.../test2.py", line 28, in <module>
parse_file('quests.xml')
File "C:/.../test2.py", line 15, in parse_file
val = json.loads(val)
File "C:\Python27\lib\json\__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "C:\Python27\lib\json\decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Python27\lib\json\decoder.py", line 384, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
It raises this error in the line val = json.loads(val)
when key="npc_name"
and val="other_npc_name"
.
What's wrong with that? It didn't raise any error when name="some string"
, but it does when npc_name="some string"
.
I noticed that if I change "other_npc_name"
to '"other_npc_name"'
it doesn't complain, but this seem a bit hackish to me
Upvotes: 0
Views: 54
Reputation: 1354
JSON is a way to store data structures - thus it can only decode said data structures.
When you try to get JSON to decode something like this:
other_npc_name
JSON can't match this to any valid data type. However, if this is wrapped in quotation marks:
"other_npc_name"
JSON recognizes this as a String (as per the JSON spec, that is how a string is defined).
And this is what is happening in your script:
import json
print json.loads("other_npc_name") #throws error
print json.loads('"other_npc_name"') #returns "other_npc_name" as a Unicode string
Thus, it may seem 'hackish' to wrap the string this way, however, this is really the only way for JSON to decode it.
One potential suggestion is that if the npc_name attribute in XML is always a string, then pull it out as a string instead of trying to decode it as a JSON object.
Upvotes: 1