Bengineer

Reputation: 7742

How can I find text in namespaced XML with ElementTree?

I have this xml:

<office:body>
<office:text>
<text:sequence-decls>
<text:sequence-decl text:display-outline-level="0" text:name="Illustration"/>
<text:sequence-decl text:display-outline-level="0" text:name="Table"/>
<text:sequence-decl text:display-outline-level="0" text:name="Text"/>
<text:sequence-decl text:display-outline-level="0" text:name="Drawing"/>
</text:sequence-decls>
<text:p text:style-name="Standard">
<office:annotation>...</office:annotation>
foobar
</text:p>
</office:text>
</office:body>

I want to find the text "foobar" with ElementTree. The text is not fixed: anything could appear in place of "foobar". How can I do that?

Upvotes: 0

Views: 792

Answers (1)

mzjn

Reputation: 50947

Assume that the XML document looks like this (with declared namespaces):

<office:document-content xmlns:office="http://openoffice.org/2000/office"
                         xmlns:text="http://openoffice.org/2000/text">

  <office:body>
    <office:text>
      <text:sequence-decls>
        <text:sequence-decl text:display-outline-level="0" text:name="Illustration"/>
        <text:sequence-decl text:display-outline-level="0" text:name="Table"/>
        <text:sequence-decl text:display-outline-level="0" text:name="Text"/>
        <text:sequence-decl text:display-outline-level="0" text:name="Drawing"/>
      </text:sequence-decls>
      <text:p text:style-name="Standard">
        <office:annotation>...</office:annotation>
        foobar
      </text:p>
    </office:text>
  </office:body>

</office:document-content>

You can then get the "foobar" string using this program:

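from xml.etree import ElementTree as ET

# parse() returns an ElementTree; find() on it searches from the root element.
tree = ET.parse("foobar.xml")
# The namespace URI in braces stands in for the "office:" prefix.
ann = tree.find(".//{http://openoffice.org/2000/office}annotation")
# tail is the text between </office:annotation> and the next tag.
print(ann.tail.strip())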

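Here, the ElementTree.find() method locates the first office:annotation element; the namespace URI in braces takes the place of the office: prefix. The Element.tail attribute returns the text that follows the element's end tag, up to the next tag, and strip() removes the surrounding whitespace.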
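
If you prefer prefixes to brace-wrapped URIs, find() also accepts a prefix-to-URI mapping as its second argument. The following is only a sketch under a few assumptions: the XML is inlined (and trimmed) so it runs without foobar.xml, and the helper name text_after_annotation is made up for illustration. It also returns None instead of raising when the annotation or its tail is missing.

```python
from xml.etree import ElementTree as ET

# Inline, trimmed copy of the document so the sketch runs without foobar.xml.
XML = """\
<office:document-content xmlns:office="http://openoffice.org/2000/office"
                         xmlns:text="http://openoffice.org/2000/text">
  <office:body>
    <office:text>
      <text:p text:style-name="Standard">
        <office:annotation>...</office:annotation>
        foobar
      </text:p>
    </office:text>
  </office:body>
</office:document-content>
"""

# Prefix-to-URI map used only for the lookup; the URIs are what must match.
NS = {
    "office": "http://openoffice.org/2000/office",
    "text": "http://openoffice.org/2000/text",
}


def text_after_annotation(root):
    """Hypothetical helper: stripped text after the first office:annotation, or None."""
    ann = root.find(".//office:annotation", NS)
    if ann is None or ann.tail is None:
        return None
    return ann.tail.strip()


root = ET.fromstring(XML)
result = text_after_annotation(root)
print(result)
```

The prefixes in the mapping do not have to match those in the document; only the URIs are compared.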

Upvotes: 1
