Reputation: 2364
I have the following XML which I have parsed from a webpage:
<!--
Parts from the iGEM Registry of Standard Biological Parts
-->
<rsbpml>
<part_list>
<part>
<part_id>151</part_id>
<part_name>BBa_B0034</part_name>
<part_short_name>B0034</part_short_name>
<part_short_desc>RBS (Elowitz 1999) -- defines RBS efficiency</part_short_desc>
<part_type>RBS</part_type>
<release_status>Released HQ 2013</release_status>
<sample_status>In stock</sample_status>
And I want to extract some of the values.
For example, I want to output the value RBS from <part_type>.
I've tried the following:
bb_xml_raw = urllib2.urlopen("http://parts.igem.org/cgi/xml/part.cgi?part=BBa_B0034")
self.parse = ET.parse(bb_xml_raw)
self.root = self.parse.getroot()
for part in self.root.findall('part_list'):
    print part.find('part_type').text
But it doesn't work. I get: AttributeError: 'NoneType' object has no attribute 'text'
What am I doing wrong?
Upvotes: 2
Views: 2677
Reputation: 12401
Try changing
for part in self.root.findall('part_list'):
to
for part in self.root.find('part_list'):
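findall returns a list of all the nodes that match, so your loop walks over every part_list node rather than over the parts inside it. find only searches the direct children of the node it is called on, and the direct children of your <part_list> node are <part> nodes, none of which has the tag part_type. So part.find('part_type') returns None, and you get your error.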
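To see this for yourself, here is a minimal sketch that reproduces the None. It assumes a trimmed inline copy of the XML from the question (the closing tags are my guess, since the pasted XML is cut off) and uses print() so it runs on Python 3 as well.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question; the closing tags are assumed.
SAMPLE = """<rsbpml>
  <part_list>
    <part>
      <part_name>BBa_B0034</part_name>
      <part_type>RBS</part_type>
    </part>
  </part_list>
</rsbpml>"""

root = ET.fromstring(SAMPLE)

# findall() gives back a list holding the <part_list> element itself.
part_lists = root.findall('part_list')

# find() only looks at direct children, and <part_type> is a grandchild here.
missing = part_lists[0].find('part_type')

# The direct children of <part_list> are the <part> elements.
child_tags = [child.tag for child in part_lists[0]]

print(missing)
print(child_tags)
```

Calling .text on that None is what raises the AttributeError.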
If you have a single part_list node, then find will return the actual node, and you can use the normal for part in syntax to walk over all of its subnodes instead.
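Roughly, that looks like the sketch below. It again assumes a trimmed inline copy of the question's XML with guessed closing tags, and it is written for Python 3; the name part_types is just for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question; the closing tags are assumed.
SAMPLE = """<rsbpml>
  <part_list>
    <part>
      <part_name>BBa_B0034</part_name>
      <part_type>RBS</part_type>
    </part>
  </part_list>
</rsbpml>"""

root = ET.fromstring(SAMPLE)

# find() returns the single <part_list> element, not a list of matches.
part_list = root.find('part_list')

part_types = []
for part in part_list:  # iterating an element yields its direct children
    part_types.append(part.find('part_type').text)

print(part_types)
```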
If you have multiple part_list tags, then you just need a nested for loop:
for part_list in self.root.findall('part_list'):
    for part in part_list:
etc.
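A sketch of the nested loop, assuming a made-up document with two part_list nodes (the second list and its Terminator entry are invented for illustration, not taken from the registry). It also skips any part without a part_type child so a missing tag does not raise the same AttributeError.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with two <part_list> nodes; the second one is invented.
SAMPLE = """<rsbpml>
  <part_list>
    <part><part_type>RBS</part_type></part>
  </part_list>
  <part_list>
    <part><part_type>Terminator</part_type></part>
    <part><part_name>no_type_here</part_name></part>
  </part_list>
</rsbpml>"""

root = ET.fromstring(SAMPLE)

found = []
for part_list in root.findall('part_list'):  # every <part_list> node
    for part in part_list:                   # every <part> inside it
        node = part.find('part_type')
        if node is not None:                 # guard against parts with no type
            found.append(node.text)

print(found)
```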
Edit: Given that this was sort of an XY problem - if what you are looking for is really a particular subpath, you can do that all at once, like this:
all_parts = self.root.findall('part_list/part')
print all_parts[0].find('part_type').text
etc.
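For completeness, a sketch of the path approach, again assuming a trimmed inline copy of the question's XML with guessed closing tags and Python 3. It uses findtext, which returns the text of the first matching child, or None if there is no match, so you do not have to call .text on a possibly missing node.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question; the closing tags are assumed.
SAMPLE = """<rsbpml>
  <part_list>
    <part>
      <part_name>BBa_B0034</part_name>
      <part_type>RBS</part_type>
    </part>
  </part_list>
</rsbpml>"""

root = ET.fromstring(SAMPLE)

# One path expression reaches every <part> under every <part_list>.
all_parts = root.findall('part_list/part')

# findtext() returns the child's text, or None when the child is absent.
part_types = [part.findtext('part_type') for part in all_parts]
first_name = all_parts[0].findtext('part_name')

print(part_types)
print(first_name)
```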
Upvotes: 2