Reputation: 7742
I have this XML:
<office:body>
<office:text>
<text:sequence-decls>
<text:sequence-decl text:display-outline-level="0" text:name="Illustration"/>
<text:sequence-decl text:display-outline-level="0" text:name="Table"/>
<text:sequence-decl text:display-outline-level="0" text:name="Text"/>
<text:sequence-decl text:display-outline-level="0" text:name="Drawing"/>
</text:sequence-decls>
<text:p text:style-name="Standard">
<office:annotation>...</office:annotation>
foobar
</text:p>
</office:text>
</office:body>
How can I extract the text "foobar" with ElementTree? The text is not fixed; "foobar" is only a placeholder and could be any text.
Upvotes: 0
Views: 792
Reputation: 50947
Assume that the XML document looks like this (with declared namespaces):
<office:document-content xmlns:office="http://openoffice.org/2000/office"
xmlns:text="http://openoffice.org/2000/text">
<office:body>
<office:text>
<text:sequence-decls>
<text:sequence-decl text:display-outline-level="0" text:name="Illustration"/>
<text:sequence-decl text:display-outline-level="0" text:name="Table"/>
<text:sequence-decl text:display-outline-level="0" text:name="Text"/>
<text:sequence-decl text:display-outline-level="0" text:name="Drawing"/>
</text:sequence-decls>
<text:p text:style-name="Standard">
<office:annotation>...</office:annotation>
foobar
</text:p>
</office:text>
</office:body>
</office:document-content>
You can then get the "foobar" string using this program:
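from xml.etree import ElementTree as ET
# Parse the file; the result is an ElementTree wrapping the document
tree = ET.parse("foobar.xml")
# Namespaced tags are matched as {namespace-URI}localname
ann = tree.find(".//{http://openoffice.org/2000/office}annotation")
# tail holds the text that follows the element's end-tag
print(ann.tail.strip())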
Here, the ElementTree.find() method is used to find the office:annotation element, and the Element.tail attribute returns the text content after the element's end-tag.
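If the paragraph might contain text before the annotation or several child elements, a slightly more general approach is to join the paragraph's own text with the tail of each direct child. The following is only a minimal sketch: the paragraph_text helper and the NS prefix map are names made up for this illustration, and the document is trimmed and inlined as a string so the example runs without a file.

```python
from xml.etree import ElementTree as ET

# Prefix map for find(); the URIs are copied from the sample document above.
NS = {
    "office": "http://openoffice.org/2000/office",
    "text": "http://openoffice.org/2000/text",
}

# Trimmed copy of the sample document, inlined so no file is needed.
XML = """<office:document-content xmlns:office="http://openoffice.org/2000/office"
    xmlns:text="http://openoffice.org/2000/text">
  <office:body>
    <office:text>
      <text:p text:style-name="Standard">
        <office:annotation>...</office:annotation>
        foobar
      </text:p>
    </office:text>
  </office:body>
</office:document-content>"""


def paragraph_text(p):
    """Hypothetical helper: the paragraph's own text plus each direct child's tail."""
    parts = [p.text or ""]
    parts.extend(child.tail or "" for child in p)
    return "".join(parts).strip()


root = ET.fromstring(XML)
p = root.find(".//text:p", NS)  # first text:p anywhere below the root
result = paragraph_text(p)
print(result)
```

Passing a prefix map to find() avoids spelling out the full {namespace-URI} form in every path. Text inside the child elements themselves (here, the annotation body) is deliberately left out.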
Upvotes: 1