Parsing Node Value of XML in Python with ElementTree

Question

I have the following XML which I have parsed from a webpage:



 
  
   151
   BBa_B0034
   B0034
   RBS (Elowitz 1999) -- defines RBS efficiency
   RBS
   Released HQ 2013
   In stock

And I want to extract some of the values.

For example, I want to output the value RBS from <part_type>.

I've tried the following:

bb_xml_raw = urllib2.urlopen("http://parts.igem.org/cgi/xml/part.cgi?part=BBa_B0034")
self.parse = ET.parse(bb_xml_raw)
self.root = self.parse.getroot()

for part in self.root.findall('part_list'):
   print part.find('part_type').text

But it doesn't work. I get: AttributeError: 'NoneType' object has no attribute 'text'

What am I doing wrong?

Corley Brigman · Accepted Answer

Try changing

for part in self.root.findall('part_list'):

to

for part in self.root.find('part_list'):

findall returns a list of all the nodes that match, so your loop iterates over the part_list nodes themselves. A part_list node has no direct children with the tag part_type (those sit one level down, inside each part node), so find returns None, and calling .text on None raises your error.
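To make the failure concrete, here is a minimal sketch that reproduces the None result. It assumes a trimmed-down document: the root tag name "root" is a placeholder I made up, and only part_list, part and part_type are taken from the question. It is written with print() so it runs on Python 3.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the downloaded XML. The root tag "root" is an
# assumption; part_list, part and part_type come from the question's code.
SAMPLE = """<root>
  <part_list>
    <part>
      <part_type>RBS</part_type>
    </part>
  </part_list>
</root>"""

root = ET.fromstring(SAMPLE)

matches = root.findall('part_list')               # list of part_list elements
child_tags = [child.tag for child in matches[0]]  # direct children of part_list
direct_hit = matches[0].find('part_type')         # find() checks direct children only

print(child_tags)
print(direct_hit)
```

The only direct child of part_list is part, so the lookup for part_type on part_list comes back empty.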

If you have a single part_list node, then find returns that node itself, and the normal for part in syntax walks over its direct children (the part nodes) instead.
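A small sketch of the find-based loop, under the same assumptions as the question's code: the sample XML is a trimmed placeholder, and the root tag name "root" is invented for illustration. The None check is an extra precaution, not something the original code had.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; only part_list, part and part_type come from the question.
SAMPLE = """<root>
  <part_list>
    <part>
      <part_type>RBS</part_type>
    </part>
  </part_list>
</root>"""

root = ET.fromstring(SAMPLE)

part_list = root.find('part_list')   # a single element, or None if there is no match
part_types = []
if part_list is not None:
    for part in part_list:           # iterating an element yields its direct children
        part_types.append(part.find('part_type').text)

print(part_types)
```

Here each part is a real part element, so part.find('part_type') succeeds and .text gives the value.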

If you have multiple part_list tags, then you just need a nested for loop:

for part_list in self.root.findall('part_list'):
    for part in part_list:
        print part.find('part_type').text
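As a rough illustration of the nested loop, the sketch below uses a made-up document with two part_list nodes. The root tag "root" and the second part's value "Coding" are invented sample data, not taken from the real registry response.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with two part_list nodes. "root" and "Coding"
# are invented for this example.
SAMPLE = """<root>
  <part_list>
    <part><part_type>RBS</part_type></part>
  </part_list>
  <part_list>
    <part><part_type>Coding</part_type></part>
  </part_list>
</root>"""

root = ET.fromstring(SAMPLE)

part_types = []
for part_list in root.findall('part_list'):   # outer loop: every part_list node
    for part in part_list:                    # inner loop: its direct children
        part_types.append(part.find('part_type').text)

print(part_types)
```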

Edit: Given that this was sort of an XY problem: if what you are really looking for is a particular subpath, you can get it all at once, like this:

all_parts = self.root.findall('part_list/part')
print all_parts[0].find('part_type').text

etc.
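The path-based lookup can be sketched as follows. As before, the sample XML and its root tag "root" are placeholders of mine. Note that .text holds an element's value ("RBS"), while .tag holds its name ("part_type"). findtext is a standard ElementTree shortcut that returns the text of the first match, or None if nothing matches.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; only part_list, part and part_type come from the question.
SAMPLE = """<root>
  <part_list>
    <part>
      <part_type>RBS</part_type>
    </part>
  </part_list>
</root>"""

root = ET.fromstring(SAMPLE)

# One path expression reaches every part under every part_list.
all_parts = root.findall('part_list/part')
first_node = all_parts[0].find('part_type')

# findtext goes straight to the text of the first matching element.
first_type = root.findtext('part_list/part/part_type')

print(first_node.tag)
print(first_node.text)
print(first_type)
```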
