Reputation: 11377
In a Python script I make a call to a SOAP service which returns an XML reply where the elements have a namespace prefix, let's say
<ns0:foo xmlns:ns0="SOME-URI">
<ns0:bar>abc</ns0:bar>
</ns0:foo>
I can extract the content of ns0:bar with the method call
doc.getElementsByTagName('ns0:bar')
However, the name ns0 is only a local variable so to speak (it's not mentioned in the schema) and might as well have been named flubber or you_should_not_care. What is the proper way to extract the content of a namespaced element without relying on it having a specific name? In my case the prefix was indeed changed in the SOAP service which resulted in a parse failure.
Upvotes: 1
Views: 46
Reputation: 3581
If you have a SOAP response, you can search with a namespace wildcard, {*}tagname:
import lxml.etree as et
xml_= """<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:tns="urn:wsNotes" SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<SOAP-ENV:Body>
<ns0:foo xmlns:ns0="SOME-URI">
<ns0:bar>abc</ns0:bar>
</ns0:foo>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""
root = et.fromstring(xml_)
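# Namespace map declared on the root element (ns0 is declared further down, so it is not in this map)
ns = root.nsmap
# Search with the namespace wildcard {*}
bar = root.find(".//{*}bar", ns).text
# Alternative: the wildcard needs no prefix map, so ns can be omitted
# bar = root.find(".//{*}bar").text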
print(bar)
Output:
abc
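Note that {*} matches bar in any namespace. If you would rather match on the namespace URI itself (the part that is fixed by the service, unlike the prefix), here is a minimal sketch using only the standard library. The helper name bar_text and the prefix x are made up for illustration; SOME-URI is the placeholder from the question.

```python
import xml.etree.ElementTree as ET

# Placeholder URI from the question; replace with the real namespace URI.
NS_URI = "SOME-URI"


def bar_text(xml_text):
    """Return the text of the first <bar> element in NS_URI, whatever its prefix."""
    root = ET.fromstring(xml_text)
    # Clark notation: {namespace-uri}localname, independent of the prefix in the document
    clark = root.find(".//{%s}bar" % NS_URI)
    # Equivalent lookup: "x" is a prefix of our own choosing, mapped to the same URI
    mapped = root.find(".//x:bar", {"x": NS_URI})
    assert clark is mapped
    return clark.text


original = '<ns0:foo xmlns:ns0="SOME-URI"><ns0:bar>abc</ns0:bar></ns0:foo>'
renamed = '<flubber:foo xmlns:flubber="SOME-URI"><flubber:bar>abc</flubber:bar></flubber:foo>'

print(bar_text(original))
print(bar_text(renamed))
```

Both calls print abc, because the lookup depends only on the URI and the local name.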
Upvotes: 0
Reputation: 12822
Searching by element name requires a namespace-aware lookup. With the DOM API, pass the namespace URI and the local name:
doc.getElementsByTagNameNS('SOME-URI','bar')
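A small self-contained sketch of that call with xml.dom.minidom, assuming the document from the question. The helper name bar_values is hypothetical, and SOME-URI is the question's placeholder.

```python
from xml.dom import minidom


def bar_values(xml_text):
    """Return the text of every <bar> element in the SOME-URI namespace."""
    doc = minidom.parseString(xml_text)
    # Match on (namespace URI, local name); the prefix used in the document is irrelevant
    nodes = doc.getElementsByTagNameNS("SOME-URI", "bar")
    # Each matched element here holds a single text node
    return [node.firstChild.data for node in nodes]


with_ns0 = '<ns0:foo xmlns:ns0="SOME-URI"><ns0:bar>abc</ns0:bar></ns0:foo>'
with_other = '<p:foo xmlns:p="SOME-URI"><p:bar>abc</p:bar></p:foo>'

print(bar_values(with_ns0))
print(bar_values(with_other))
```

Both calls return the same list even though the prefix differs, which is the behaviour the question asks for.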
If you are using a package with namespace support, such as lxml, put the namespace URI in braces before the tag name:
tree.findall('{http://schemas.xmlsoap.org/soap/envelope/}Body')
or by local name
tree.xpath('//*[local-name()="bar"]')
lxml example
from lxml import etree
tree = etree.parse("/home/lmc/tmp/soap.xml")
tree.xpath('//*[local-name()="Company"]')
Result
[<Element {http://example.com}Company at 0x7f0959fb3fc0>]
Upvotes: 2