Laurent LAPORTE

Reputation: 22952

Prepend or append PI before / after the root element with lxml

With lxml, how can I prepend processing instructions (PIs) before the root element, or append PIs after the root element?

Currently, the following example doesn't work:

from lxml import etree

root = etree.XML("<ROOT/>")
root.addprevious(etree.ProcessingInstruction("foo"))
print(etree.tounicode(root))

I get:

<ROOT/>

Instead of:

<?foo?><ROOT/>

Upvotes: 3

Views: 610

Answers (2)

Tupteq

Reputation: 3095

You need to pass the ElementTree, not just the root Element, to tounicode():

from lxml import etree

root = etree.XML("<ROOT/>")
root.addprevious(etree.ProcessingInstruction("foo"))
print(etree.tounicode(root.getroottree()))

The output is almost what you wanted:

<?foo ?><ROOT/>

The extra space character after foo shows up because lxml renders a PI as pi.target + " " + pi.text, and the text is empty here.
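
To see that separator at work, the same approach can be tried with a PI that has some text. This is only a minimal sketch: the target foo comes from the question, while the text bar="1" is a made-up value used for illustration. It also uses tostring() with encoding="unicode", which is assumed here to behave like tounicode().

```python
from lxml import etree

root = etree.XML("<ROOT/>")

# The text 'bar="1"' is a hypothetical value, chosen only for illustration.
pi = etree.ProcessingInstruction("foo", 'bar="1"')
root.addprevious(pi)

# Serialize the whole tree so that the top-level PI is included.
result = etree.tostring(root.getroottree(), encoding="unicode")
print(result)
```

With non-empty text, the space separates the target from the text instead of trailing the target.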

Upvotes: 2

Laurent LAPORTE

Reputation: 22952

Actually, an Element is always attached to an ElementTree, even if it looks "detached":

from lxml import etree

root = etree.XML("<ROOT/>")
assert root.getroottree() is not None

When we use addprevious/addnext to insert a processing instruction before/after a root element, the PIs are not attached to a parent element (there isn't any) but they are attached to the root tree instead.
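
One way to check this is to look at the PI from both sides after inserting it. The following is a small sketch rather than a definitive test; it relies on getparent() and getprevious(), which are not used elsewhere in this thread.

```python
from lxml import etree

root = etree.XML("<ROOT/>")
pi = etree.ProcessingInstruction("foo")
root.addprevious(pi)

# The PI has no parent element...
print(pi.getparent())             # None

# ...but it is reachable as the preceding sibling of the root element.
print(root.getprevious().target)  # foo

# Nothing was added after the root element.
print(root.getnext())             # None
```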

So the problem lies in how tounicode (or tostring) is called. The best practice is to serialize the root tree, not the root element.

from lxml import etree

root = etree.XML("<ROOT/>")
root.addprevious(etree.ProcessingInstruction("foo"))
root.addnext(etree.ProcessingInstruction("bar"))

print(etree.tounicode(root))
# => "<ROOT/>"

print(etree.tounicode(root.getroottree()))
# => "<?foo ?><ROOT/><?bar ?>"
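
The lxml documentation marks tounicode() as deprecated in favour of tostring() with encoding="unicode". Here is a sketch of the same comparison with that call. The variable names element_only and whole_tree are only illustrative.

```python
from lxml import etree

root = etree.XML("<ROOT/>")
root.addprevious(etree.ProcessingInstruction("foo"))
root.addnext(etree.ProcessingInstruction("bar"))

# Serializing the element alone leaves out its top-level siblings.
element_only = etree.tostring(root, encoding="unicode")

# Serializing the tree includes the PIs before and after the root element.
whole_tree = etree.tostring(root.getroottree(), encoding="unicode")

print(element_only)
print(whole_tree)
```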

Upvotes: 2
