dobbs
dobbs

Reputation: 1043

Python ElementTree find not working

I am new to ElementTree. I'm trying to snag the <sid> value from an XML response.

The following code is not working for me. How do I extract the value in <sid> ? I am not sure where the number 53 is coming from here.

    ...
    r = requests.post(self.dispatchurl, verify=False, auth=HTTPBasicAuth(self.user, self.passwd))
    print r.content
    tree = ET.ElementTree(r.content)
    print tree.find('sid')

output:

/usr/bin/python2.7 /home/myuser/PycharmProjects/autoshun/shunlibs/SplunkSearch.py
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <sid>super__awesome__search__searchname_at_1489433276_24700</sid>
</response>

53

Process finished with exit code 0

Upvotes: 4

Views: 7318

Answers (4)

Senthilkumar Gopal
Senthilkumar Gopal

Reputation: 419

Following on the comment to use namespace, I printed out the element in question which is typically in the format <Element '{urn:ebay:apis:eBLBaseComponents}UploadSiteHostedPicturesResponse' at 0x7f4e9409c590>. Using this namespace, I was able to perform the find() via responseXml.find('{urn:ebay:apis:eBLBaseComponents}SiteHostedPictureDetails').find('{urn:ebay:apis:eBLBaseComponents}FullURL').text. The reference used was https://docs.python.org/3/library/xml.etree.elementtree.html#parsing-xml-with-namespaces

Upvotes: 0

Dominik
Dominik

Reputation: 1471

I know the OP is almost 2 years old, however I came across the same issue and found the solution that worked for me. What worked was removing the namespace right after the parse call which loads the XML file.

A function like this:

def remove_namespace(xmldoc, namespace):
    ns = u'{%s}' % namespace
    nsl = len(ns)
    for elem in xmldoc.getiterator():
        if elem.tag.startswith(ns):
            elem.tag = elem.tag[nsl:]

I have discovered that ElementTree has issues with namespaces, you either strip them out using a function like the one above, or you have to pass the explicit namespace to the find or findall functions.

Upvotes: 3

Bill Bell
Bill Bell

Reputation: 21643

As an alternative, you can use xpath.

>>> import xml.etree.ElementTree as ET
>>> xml = '''\
... <?xml version="1.0" encoding="UTF-8"?>
... <response>
...     <sid>super__awesome__search__searchname_at_1489433276_24700</sid>
... </response>
... '''
>>> root = ET.fromstring(xml)
>>> for sid in root.iterfind('.//sid'):
...     sid.text
... 
'super__awesome__search__searchname_at_1489433276_24700'
>>> 

Upvotes: 0

dobbs
dobbs

Reputation: 1043

the following code worked for me:

    r = requests.post(self.dispatchurl, verify=False, auth=HTTPBasicAuth(self.user, self.passwd), stream=True)
    root = ET.fromstring(r.content)
    for i in root.iter('response'):
        print i.find('sid').text

Upvotes: 1

Related Questions