Reputation: 4055
I have the following code fragment:
from xml.etree.ElementTree import fromstring,tostring
mathml = fromstring(input)
# iter() replaces the deprecated getiterator() (removed in Python 3.9)
for elem in mathml.iter():
    elem.tag = 'm:' + elem.tag
return tostring(mathml)
When I give it the following input:
<math>
<a> 1 2 3 </a> <b />
<foo>Uitleg</foo>
<!-- <bar> -->
</math>
It results in:
<m:math>
<m:a> 1 2 3 </m:a> <m:b />
<m:foo>Uitleg</m:foo>
</m:math>
How come? And how can I preserve the comment?
Edit: I don't care which XML library is used, but I need to be able to apply the tag change shown above. Unfortunately, lxml does not seem to allow this (and I cannot use proper namespace operations).
Upvotes: 15
Views: 7721
Reputation: 28686
You cannot with xml.etree's default parser, because it ignores comments (which is acceptable behaviour for an XML parser, by the way). But you can if you use the (compatible) lxml library, which lets you configure parser options.
from lxml import etree
parser = etree.XMLParser(remove_comments=False)
tree = etree.parse('input.xml', parser=parser)
# or alternatively set the parser as default:
# etree.set_default_parser(parser)
This would be by far the easiest option. If you really have to use xml.etree, you could try hooking up your own parser, although even then comments are not officially supported: have a look at this example (from the author of xml.etree), which still seems to work in Python 2.7.
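For readers on Python 3.8 or newer, the "own parser" route can be sketched with the standard library alone, since TreeBuilder accepts insert_comments=True there. This is only a minimal sketch under that version assumption, not the linked example; the helper name prefix_tags and the sample string are made up for illustration.

```python
import xml.etree.ElementTree as ET


def prefix_tags(xml_text, prefix="m:"):
    """Prefix every element tag while keeping comments (hypothetical helper)."""
    # Assumption: Python 3.8+, where TreeBuilder supports insert_comments.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(xml_text, parser=parser)
    for elem in root.iter():
        # Comment nodes use the ET.Comment function as their tag, not a
        # string, so only real elements are renamed.
        if isinstance(elem.tag, str):
            elem.tag = prefix + elem.tag
    # encoding="unicode" returns str instead of bytes
    return ET.tostring(root, encoding="unicode")


source = "<math><a> 1 2 3 </a> <b /><foo>Uitleg</foo><!-- <bar> --></math>"
result = prefix_tags(source)
print(result)
```

The isinstance check matters: without it, the loop from the question would raise a TypeError as soon as it reached the preserved comment node.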
Upvotes: 17