Bryant Makes Programs

Reputation: 1694

How do you get the original text from an etree object?

I'm having trouble finding documentation on this.

I have an object of type lxml.etree._ElementTree and am trying to get the original text from it.

The object was generated by executing:

tree = etree.parse(content, parser=parser)

I then need to access the original content much later in the script, when content is no longer available. I would like to get it by calling some function on tree, but I cannot find any documentation for this.

I found a reference to a tostring function, but it does not seem to be a valid method of the tree.

Thoughts?

Upvotes: 0

Views: 154

Answers (2)

Triggernometry

Reputation: 583

tostring is not a method of the tree object; it is a function in the lxml.etree module.

So try lxml.etree.tostring(tree).

Note that the result may not be EXACTLY the same as the original file: it should parse to the same XML, but spaces, newlines, and other formatting may differ. Also, if you have made any changes to the tree, the output will reflect those changes rather than the original file.
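To illustrate the caveat above, here is a minimal sketch. The input bytes and the variable names (original, serialized, reparsed) are made up for the example and only stand in for the asker's content; the parser is a default XMLParser, which may not match the one in the question.

```python
import io
from lxml import etree

# Hypothetical input standing in for the asker's `content`.
original = (
    b"<?xml version='1.0' encoding='UTF-8'?>\n"
    b"<people><person id='1'>Hal</person></people>\n"
)

parser = etree.XMLParser()
tree = etree.parse(io.BytesIO(original), parser=parser)

# tostring is a module-level function; pass it the tree (or an element).
serialized = etree.tostring(tree)

# The serialized bytes are not a copy of the input: with default
# arguments the XML declaration is not written back, for instance.
same_bytes = serialized == original

# The serialized form still parses to the same document content.
reparsed = etree.fromstring(serialized)
```

If a byte-for-byte copy really is needed later in the script, keeping the raw input in a variable alongside the tree (as original is kept here) is probably the safer route.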

Upvotes: 4

Haleemur Ali

Reputation: 28263

tostring is a valid function; perhaps you're calling it incorrectly. Here's a self-contained example:

from lxml import etree

text = """<?xml version="1.0" ?>
<people>
  <person>
    <id>1</id>
    <name>Hal</name>
    <notes>Hal likes chocolate</notes>
  </person>
</people>"""

root = etree.fromstring(text)
etree.tostring(root)
# outputs the following (a bytes object under Python 3)
b'<people>\n  <person>\n    <id>1</id>\n    <name>Hal</name>\n    <notes>Hal likes chocolate</notes>\n  </person>\n</people>'
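As a follow-up sketch, tostring also accepts keyword arguments that control the form of the output. The sample document and the names as_text and with_decl below are illustrative only.

```python
from lxml import etree

root = etree.fromstring(b"<people><person><name>Hal</name></person></people>")

# encoding="unicode" returns a str instead of bytes.
as_text = etree.tostring(root, encoding="unicode")

# Request an XML declaration together with an explicit byte encoding.
with_decl = etree.tostring(root, xml_declaration=True, encoding="UTF-8")
```

Neither form recovers the original file's exact formatting; they only change how the parsed tree is written out.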

Upvotes: 1
