August Karlstrom

Reputation: 11377

Proper way to extract XML elements from a namespace

In a Python script I call a SOAP service that returns an XML reply whose elements have a namespace prefix, for example:

<ns0:foo xmlns:ns0="SOME-URI">
  <ns0:bar>abc</ns0:bar>
</ns0:foo>

I can extract the content of ns0:bar with the method call

doc.getElementsByTagName('ns0:bar')

However, the prefix ns0 is only a local name, so to speak (it is not mentioned in the schema), and might as well have been flubber or you_should_not_care. What is the proper way to extract the content of a namespaced element without relying on the prefix having a specific name? In my case the SOAP service did change the prefix, which resulted in a parse failure.

Upvotes: 1

Views: 46

Answers (2)

Hermann12

Reputation: 3581

If you have a SOAP response, you can search with a namespace wildcard, {*}tagname, which matches the tag in any namespace regardless of its prefix:

import lxml.etree as et

xml_= """<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:tns="urn:wsNotes" SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
                <SOAP-ENV:Body>
                <ns0:foo xmlns:ns0="SOME-URI">
                  <ns0:bar>abc</ns0:bar>
                </ns0:foo>
                </SOAP-ENV:Body>
            </SOAP-ENV:Envelope>"""

root = et.fromstring(xml_)
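# Prefix-to-URI map declared on the root element
# (ns0 is declared further down, so it is not part of this map)
ns = root.nsmap

# Search with the namespace wildcard {*}, which matches bar in any namespace
bar = root.find(".//{*}bar", ns).text
# Alternatively, omit ns: the wildcard search does not need the prefix map
# bar = root.find(".//{*}bar").text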
print(bar)

Output:

abc
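
If the namespace URI is known (it is the stable part, unlike the prefix), the lookup can also be tied to the URI itself. The following is a minimal sketch using only the standard library's xml.etree.ElementTree rather than lxml; the prefix x is an arbitrary name chosen for this example, and the sample document deliberately uses a different prefix than ns0.

```python
import xml.etree.ElementTree as ET

# Same structure as in the question, but with a renamed prefix
XML = """<flubber:foo xmlns:flubber="SOME-URI">
  <flubber:bar>abc</flubber:bar>
</flubber:foo>"""

root = ET.fromstring(XML)

# Option 1: qualified name in {URI}tag form, no prefix involved at all
by_uri = root.find(".//{SOME-URI}bar").text

# Option 2: bind a prefix of our own choosing ("x") to the URI
by_prefix = root.find(".//x:bar", {"x": "SOME-URI"}).text

print(by_uri, by_prefix)
```

Both lookups depend only on the URI SOME-URI, so they keep working when the service renames the prefix.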

Upvotes: 0

LMC

Reputation: 12822

Searching by element name needs a namespace-aware call. With the DOM API, pass the namespace URI and the local name instead of the prefixed name:

doc.getElementsByTagNameNS('SOME-URI','bar')
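
As a rough, self-contained illustration of that call, assuming the document is parsed with xml.dom.minidom (the question does not say which DOM implementation is used), the helper bar_texts below is a hypothetical name introduced for this sketch:

```python
from xml.dom import minidom


def bar_texts(xml_text):
    """Return the text of every bar element in the SOME-URI namespace."""
    doc = minidom.parseString(xml_text)
    # Match on (namespace URI, local name); the prefix is irrelevant
    nodes = doc.getElementsByTagNameNS("SOME-URI", "bar")
    return [node.firstChild.data for node in nodes]


# The same document with two different prefixes
with_ns0 = '<ns0:foo xmlns:ns0="SOME-URI"><ns0:bar>abc</ns0:bar></ns0:foo>'
with_flubber = '<flubber:foo xmlns:flubber="SOME-URI"><flubber:bar>abc</flubber:bar></flubber:foo>'

print(bar_texts(with_ns0))
print(bar_texts(with_flubber))
```

Both documents give the same result, since only the URI and the local name take part in the match.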

If using a package with namespace support, like lxml, search by the qualified name in {URI}tag form:

tree.findall('{http://schemas.xmlsoap.org/soap/envelope/}Body')

or by local name only, ignoring the namespace:

tree.xpath('//*[local-name()="bar"]')

lxml example

from lxml import etree
tree = etree.parse("/home/lmc/tmp/soap.xml")
tree.xpath('//*[local-name()="Company"]')

Result

[<Element {http://example.com}Company at 0x7f0959fb3fc0>]
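
If lxml is not available, a similar local-name match can be approximated with the standard library. This is only a sketch: xml.etree.ElementTree has no local-name() function, so the hypothetical helper find_by_local_name strips the {URI} part of each tag by hand, and the sample XML is made up to resemble the result above.

```python
import xml.etree.ElementTree as ET


def find_by_local_name(root, local_name):
    """Return elements whose tag, without its {namespace} part, equals local_name."""
    return [
        el for el in root.iter()
        if isinstance(el.tag, str) and el.tag.rsplit("}", 1)[-1] == local_name
    ]


# Made-up sample document for illustration
SAMPLE = """<e:Root xmlns:e="http://example.com">
  <e:Company>ACME</e:Company>
</e:Root>"""

matches = find_by_local_name(ET.fromstring(SAMPLE), "Company")
print([el.tag for el in matches])
```

Like local-name(), this ignores the namespace entirely, so it would also match a Company element from an unrelated namespace.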

Upvotes: 2
