Kin

Reputation: 4596

How to search for XML elements in Python while ignoring namespace prefixes?

The code shown below works, but the problem is that I have to hard-code namespace prefixes such as d:. Is it possible to search for elements while ignoring these prefixes, for example with something like dom.getElementsByTagName('Scopes')?

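from xml.dom.minidom import parseString

def parseSoapBody(soap_data):
    dom = parseString(soap_data)

    # The 'd:' prefix is hard-coded here; lookup fails if the document uses another prefix.
    return {
        'scopes': dom.getElementsByTagName('d:Scopes')[0].firstChild.nodeValue,
        'address': dom.getElementsByTagName('d:XAddrs')[0].firstChild.nodeValue,
    }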

Upvotes: 0

Views: 62

Answers (1)

unutbu

Reputation: 879681

Since your code uses parseString and getElementsByTagName, I'm assuming you are using minidom. In that case, try:

dom.getElementsByTagNameNS('*', 'Scopes')

It doesn't say so in the docs, but if you look at the source code in xml/dom/minidom.py, you'll see that getElementsByTagNameNS calls _get_elements_by_tagName_ns_helper, which is defined like this:

def _get_elements_by_tagName_ns_helper(parent, nsURI, localName, rc):
    for node in parent.childNodes:
        if node.nodeType == Node.ELEMENT_NODE:
            if ((localName == "*" or node.localName == localName) and
                (nsURI == "*" or node.namespaceURI == nsURI)):
                rc.append(node)
            _get_elements_by_tagName_ns_helper(node, nsURI, localName, rc)
    return rc

Notice that when nsURI equals '*', the namespace check always passes, so only the localName needs to match.
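To see how the two wildcard checks in the helper interact, here is a small sketch that compares a wildcard namespace, a specific namespace URI, and a wildcard local name. The document and the namespace URIs "foo" and "bar" are made up for illustration.

```python
import xml.dom.minidom as minidom

# Illustrative document: two prefixes bound to two made-up namespace URIs.
content = '<root xmlns:f="foo" xmlns:b="bar"><f:test/><b:test/><b:other/></root>'
dom = minidom.parseString(content)

# '*' as the namespace: any element whose local name is 'test' matches.
any_ns = [n.tagName for n in dom.getElementsByTagNameNS('*', 'test')]

# A concrete namespace URI: only 'test' elements in namespace 'foo' match.
only_foo = [n.tagName for n in dom.getElementsByTagNameNS('foo', 'test')]

# '*' as the local name: every element in namespace 'bar' matches.
all_bar = [n.tagName for n in dom.getElementsByTagNameNS('bar', '*')]

print(any_ns)    # ['f:test', 'b:test']
print(only_foo)  # ['f:test']
print(all_bar)   # ['b:test', 'b:other']
```

Note that the first argument is the namespace URI, not the prefix, so passing 'f' instead of 'foo' would match nothing here.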


For example,

import xml.dom.minidom as minidom
content = '''<root xmlns:f="foo"><f:test/><f:test/></root>'''
dom = minidom.parseString(content)
for n in dom.getElementsByTagNameNS('*', 'test'):
    print(n.toxml())
    # <f:test/>
    # <f:test/>
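Applied to the function in the question, the same call might look like the sketch below. The sample payload, its namespace URIs, and the helper name first_text are assumptions made for illustration; only the element names Scopes and XAddrs come from the question.

```python
from xml.dom.minidom import parseString

# Hypothetical payload: the namespace URIs and values are invented for this sketch.
SAMPLE = (
    '<e:Envelope xmlns:e="urn:example:envelope" xmlns:d="urn:example:discovery">'
    '<e:Body>'
    '<d:Scopes>scope-a scope-b</d:Scopes>'
    '<d:XAddrs>http://192.0.2.1/service</d:XAddrs>'
    '</e:Body>'
    '</e:Envelope>'
)

def first_text(dom, local_name):
    """Return the text of the first element with this local name, in any namespace."""
    nodes = dom.getElementsByTagNameNS('*', local_name)
    if not nodes or nodes[0].firstChild is None:
        return None  # element missing or empty
    return nodes[0].firstChild.nodeValue

def parse_soap_body(soap_data):
    dom = parseString(soap_data)
    return {
        'scopes': first_text(dom, 'Scopes'),
        'address': first_text(dom, 'XAddrs'),
    }

result = parse_soap_body(SAMPLE)
print(result)
```

Returning None for a missing element is a design choice of this sketch; the original code would raise an IndexError instead, which may be preferable if the elements are mandatory.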

Upvotes: 1
