Almira Bojani

Reputation: 579

XML parsing problem in Python - cannot find findtext

I tried to parse XML with this code:

import urllib2
from xml.etree import ElementTree

if __name__ == '__main__':
    print 'hello'
    result = urllib2.urlopen('http://localhost/conf.xml').read()
    xml = ElementTree.fromstring(result)
    print result
    print xml.findtext('.//type')

When I print result I get the whole XML file, which is fine, but the last line (xml.findtext) always returns None (I have a tag with type and the value mstp). Can anybody help me with this? I looked at the StackOverflow question "How to parse xml in Python on Google App Engine", but I don't get any results with './/type'.

Here is the XML file:

<router>
  <datalink
    type="mstp"
    network="13"
    mac="18"
    hopcount="8">
    <mqueue
      name="/mstp1"
      msgnum="10"
      msgsize="768"
    />
    <mstp
      port="/dev/ttySx"
      baud="9600|19200|38400|76800"
      Nmax_info_frames="1+"
      Nmax_master="127-"
      Npoll="50"
      Nretry_token="1"
      Nmin_octets="4"
      Tframe_abort="60-100"
      Tframe_gap="20"
      Tno_token="500"
      Tpostdrive="15"
      Treply_delay="250"
      Treply_timeout="255-300"
      Troff="29-40"
      Tslot="10"
      Tturnaround="40"
      Tusage_delay="15"
      Tusage_timeout="20-100"
    />
  </datalink>
  <datalink
    type="bip"
    network="12"
    mac="192.168.0.146:47808"
    hopcount="8"
    >
    <mqueue
      name="/bip1"
      msgnum="10"
      msgsize="2048"
    />
    <bip
      bbmd="address|self|none"
      bmask="bmask"
    >
      <bbmd
        edit="yes|no"> <!-- table editing allowed -->
        <bdt address="192.168.0.131:0xBAC0:192.168.0.255"/> <!-- address:port:bmask -->
        <bdt address="192.168.0.157:0xBAC0:192.168.0.255"/>
      </bbmd>
    </bip>
  </datalink>
  <network
    unavailable="90%"
    available="40%"
    hop-dec="1">
    <mqueue
      name="/network"
      msgnum="40"
      msgsize="2048"
    />
    <!--  -->
    <hrpolicy
      general="ignore|activate|performance|demand"
      performance="num"
      conntime="num"
    />
  </network>
  <application>
    <mqueue
      name="/application"
      msgnum="10"
      msgsize="2048"
    />
  </application>
</router>

Upvotes: 1

Views: 2659

Answers (3)

flying sheep

Reputation: 8942

ElementTree has the nasty habit of using namespaced identifiers. I guess that your XML file has namespaces, so your search would have to look something like this:

print xml.findtext('.//{http://really-long-namespace.uri}type')

Look at this question; there are some ways to cope with this.

/edit: I posted this answer back when the XML wasn’t provided in the question.
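
For readers whose XML does declare a namespace, here is a minimal sketch of what the qualified lookup looks like. The sample document and the namespace URI http://example.com/ns are made up for illustration; they are not taken from the question's file, which has no namespaces.

```python
from xml.etree import ElementTree

# Hypothetical document: the namespace URI is invented for this example.
SAMPLE = '<router xmlns="http://example.com/ns"><type>mstp</type></router>'

root = ElementTree.fromstring(SAMPLE)

# An unqualified tag name does not match a namespaced element.
plain = root.findtext('.//type')

# Qualifying the tag with {uri} matches it.
qualified = root.findtext('.//{http://example.com/ns}type')

# The same lookup using a prefix map instead of spelling out the URI.
ns = {'r': 'http://example.com/ns'}
prefixed = root.findtext('.//r:type', namespaces=ns)

print(plain)      # None
print(qualified)  # mstp
print(prefixed)   # mstp
```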

Upvotes: 1

Abdul Kader

Reputation: 5842

In your XML, type is not a tag but an attribute. findtext('.//type') looks for a tag named type anywhere in your XML; if it finds one, it returns that element's .text. To get the type attribute from your XML, you can do something like this:

xml = ElementTree.fromstring(result)
datalink = xml.find('.//datalink')
datalink_type = datalink.get('type')  # named to avoid shadowing the built-in type()
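
The difference between the two lookups can be seen in a small self-contained sketch. SAMPLE is a trimmed-down stand-in for the question's file (only two attributes per datalink are kept), and the use of findall to collect every datalink is an addition for illustration.

```python
from xml.etree import ElementTree

# Trimmed-down stand-in for the question's conf.xml.
SAMPLE = """
<router>
  <datalink type="mstp" network="13"/>
  <datalink type="bip" network="12"/>
</router>
"""

root = ElementTree.fromstring(SAMPLE)

# There is no <type> element, so findtext has nothing to return.
missing = root.findtext('.//type')

# find() returns the first matching element; get() reads one of its attributes.
first_type = root.find('.//datalink').get('type')

# findall() returns every match, which gives the type of each datalink.
all_types = [d.get('type') for d in root.findall('.//datalink')]

print(missing)     # None
print(first_type)  # mstp
print(all_types)   # ['mstp', 'bip']
```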

Upvotes: 2

John Machin

Reputation: 82934

It's nothing to do with namespaces.

Problem 1: type is NOT a tag, it is an attribute of the element whose tag is datalink.

Problem 2: xml.findtext() returns the text component of the element; that's NOT what you want.

What you do want is this:

elem = xml.find(".//datalink")
print repr(elem)
print elem.get("type")

Output:

<Element 'datalink' at 0x019D0AB8>
mstp
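
If a specific datalink is wanted rather than the first one, ElementTree's limited XPath support (Python 2.7 and later) also accepts an attribute predicate. The sketch below assumes a trimmed-down version of the question's file; attr_of is a hypothetical helper, not part of ElementTree, added because find() returns None when nothing matches.

```python
from xml.etree import ElementTree

# Trimmed-down stand-in for the question's conf.xml.
SAMPLE = """
<router>
  <datalink type="mstp" network="13"/>
  <datalink type="bip" network="12"/>
</router>
"""


def attr_of(root, path, name, default=None):
    """Hypothetical helper: attribute `name` of the first match, or `default`."""
    elem = root.find(path)
    if elem is None:
        # No element matched the path, so there is no attribute to read.
        return default
    return elem.get(name, default)


root = ElementTree.fromstring(SAMPLE)

# Select the datalink whose type attribute is "bip" and read its network.
bip_network = attr_of(root, ".//datalink[@type='bip']", "network")

# A predicate that matches nothing falls back to the default.
no_such = attr_of(root, ".//datalink[@type='zigbee']", "network", default="n/a")

print(bip_network)  # 12
print(no_such)      # n/a
```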

Upvotes: 4
