Reputation: 27969
When I try to parse broken XML, the exception reports only the line and column number. Is there a way to show the surrounding XML as well?
I want to see the XML tags before and after the broken part.
Example:
import xml.etree.ElementTree as ET
tree = ET.fromstring('<a><b></a>')
Exception:
Traceback (most recent call last):
  File "tmp/foo.py", line 2, in <module>
    tree = ET.fromstring('<a><b></a>')
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1300, in XML
    parser.feed(text)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: mismatched tag: line 1, column 8
Something like this would be nice:
ParseError:
<a><b></a>
=====^
Upvotes: 7
Views: 18301
Reputation: 880389
You could make a helper function to do this:
import sys
import io
import itertools as IT
import xml.etree.ElementTree as ET

PY2 = sys.version_info[0] == 2
StringIO = io.BytesIO if PY2 else io.StringIO

def myfromstring(content):
    try:
        tree = ET.fromstring(content)
    except ET.ParseError as err:
        lineno, column = err.position
        # lineno is 1-based, so skip lineno - 1 lines to reach the offending one
        line = next(IT.islice(StringIO(content), lineno - 1, None)).rstrip('\r\n')
        caret = '{:=>{}}'.format('^', column)
        err.msg = '{}\n{}\n{}'.format(err, line, caret)
        raise
    return tree

myfromstring('<a><b></a>')
yields
xml.etree.ElementTree.ParseError: mismatched tag: line 1, column 8
<a><b></a>
=======^
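The question also asks for the tags before and after the broken part. A minimal sketch along those lines, assuming the input is a str and that err.position holds a 1-based line and a 0-based column (which is how expat appears to report it). The names format_parse_error, parse_with_context and the context parameter are hypothetical, not part of ElementTree:

```python
import xml.etree.ElementTree as ET


def format_parse_error(content, err, context=2):
    """Return the error message plus nearby source lines and a caret.

    `context` is the number of lines shown before and after the
    offending line (a made-up parameter for this sketch).
    """
    lineno, column = err.position  # assumed: 1-based line, 0-based column
    lines = content.splitlines()
    first = max(lineno - 1 - context, 0)
    last = min(lineno + context, len(lines))
    out = [str(err)]
    for i in range(first, last):
        out.append(lines[i])
        if i == lineno - 1:
            # Put the caret directly under the reported column.
            out.append(' ' * column + '^')
    return '\n'.join(out)


def parse_with_context(content, context=2):
    """Parse content; on failure raise ValueError carrying the context."""
    try:
        return ET.fromstring(content)
    except ET.ParseError as err:
        raise ValueError(format_parse_error(content, err, context))
```

Calling parse_with_context(xml_text) on a multi-line document then raises a ValueError whose message contains the original ParseError text followed by the lines around the failure.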
Upvotes: 16
Reputation: 7931
It's not the best option, but it's simple: you can parse the ParseError message itself.
Extract the line and column from it, then use them to show where the problem is.
import xml.etree.ElementTree as ET
from xml.etree.ElementTree import ParseError

my_string = '<a><b><c></b></a>'
try:
    tree = ET.fromstring(my_string)
except ParseError as e:
    formatted_e = str(e)
    # The message ends with "line N, column M"; slice both numbers out of it
    line = int(formatted_e[formatted_e.find("line ") + 5: formatted_e.find(",")])
    column = int(formatted_e[formatted_e.find("column ") + 7:])
    split_str = my_string.split("\n")
    # print() with a single argument works on both Python 2 and 3
    print("{}\n{}^".format(split_str[line - 1], len(split_str[line - 1][0:column]) * "-"))
Note: splitting on \n is just for this example; you need to split the input in whatever way matches its actual line endings.
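As a rough sketch of what splitting "the right way" could look like: XML 1.0 treats \r\n, a lone \r, and \n all as line ends, so one option is to split on exactly those. The helper names split_xml_lines and show_error_line are made up for this example, and it reads the numbers from e.position, which carries the same line and column that appear in the message text:

```python
import re
import xml.etree.ElementTree as ET


def split_xml_lines(text):
    # Split on the line ends XML 1.0 recognizes: "\r\n", lone "\r", "\n".
    return re.split(r'\r\n|\r|\n', text)


def show_error_line(text):
    """Return the offending line with a dashed pointer, or None if text parses."""
    try:
        ET.fromstring(text)
    except ET.ParseError as e:
        line, column = e.position  # same values as in the message text
        source_line = split_xml_lines(text)[line - 1]
        return '{}\n{}^'.format(source_line, '-' * column)
    return None
```

With this, a document saved with Windows line endings still maps the reported line number to the right piece of text.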
Upvotes: 2