Reputation: 22952
With lxml, how can I prepend processing instructions (PIs) before the root element, or append PIs after the root element?
Currently, the following example doesn't work:
from lxml import etree
root = etree.XML("<ROOT/>")
root.addprevious(etree.ProcessingInstruction("foo"))
print(etree.tounicode(root))
I get:
<ROOT/>
Instead of:
<?foo?><ROOT/>
Upvotes: 3
Views: 610
Reputation: 3095
You need to use the ElementTree, not just the Element, in tounicode():
from lxml import etree
root = etree.XML("<ROOT/>")
root.addprevious(etree.ProcessingInstruction("foo"))
print(etree.tounicode(root.getroottree()))
The output is almost what you wanted:
<?foo ?><ROOT/>
The extra space after foo appears because lxml renders a PI as pi.target + " " + pi.text, so a PI with empty text still gets the separating space.
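For comparison, here is a minimal sketch of the same approach with a PI that carries text. The target and text (xml-stylesheet, style.xsl) are made up for illustration, and it uses tostring(..., encoding="unicode"), which lxml's documentation recommends over the deprecated tounicode():

```python
from lxml import etree

root = etree.XML("<ROOT/>")

# Illustrative PI with both a target and text (values are hypothetical).
pi = etree.ProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="style.xsl"'
)
root.addprevious(pi)

# Serialize the tree, not the element, so the sibling PI is included.
serialized = etree.tostring(root.getroottree(), encoding="unicode")
print(serialized)
```

With non-empty text, the space falls naturally between the target and the text.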
Upvotes: 2
Reputation: 22952
Actually, an Element is always attached to an ElementTree, even if it looks "detached":
root = etree.XML("<ROOT/>")
assert root.getroottree() is not None
When we use addprevious/addnext to insert a processing instruction before or after a root element, the PI is not attached to a parent element (there isn't one); it is attached to the root tree instead.
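A small sketch of how this can be checked: the PIs are reachable as siblings of the root via getprevious()/getnext(), while the root itself still has no parent. The variable names are only for illustration:

```python
from lxml import etree

root = etree.XML("<ROOT/>")
root.addprevious(etree.ProcessingInstruction("foo"))
root.addnext(etree.ProcessingInstruction("bar"))

# The PIs live next to the root at the document level.
before = root.getprevious()
after = root.getnext()

print(root.getparent())             # the root still has no parent element
print(before.target, after.target)  # targets of the two sibling PIs
```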
So the problem lies in how tounicode (or tostring) is called. The best practice is to serialize the root tree, not the root element:
from lxml import etree
root = etree.XML("<ROOT/>")
root.addprevious(etree.ProcessingInstruction("foo"))
root.addnext(etree.ProcessingInstruction("bar"))
print(etree.tounicode(root))
# => "<ROOT/>"
print(etree.tounicode(root.getroottree()))
# => "<?foo ?><ROOT/><?bar ?>"
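If you need bytes with an XML declaration (for writing to a file, say), the same idea can be wrapped in a small helper. This is only a sketch: serialize_document is a hypothetical name, not part of lxml, and it assumes you pass the root element, since getroottree() always serializes the whole document:

```python
from lxml import etree


def serialize_document(element, encoding="UTF-8"):
    """Hypothetical helper: serialize the tree that owns `element`,
    including any PIs or comments placed before or after the root."""
    tree = element.getroottree()
    return etree.tostring(tree, xml_declaration=True, encoding=encoding)


root = etree.XML("<ROOT/>")
root.addprevious(etree.ProcessingInstruction("foo"))
root.addnext(etree.ProcessingInstruction("bar"))

data = serialize_document(root)
print(data.decode("UTF-8"))
```

Passing the tree to tostring is what pulls in the document-level siblings; passing the element alone drops them, as shown above.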
Upvotes: 2