Reputation: 961
I have the following kind of data returned to me in XML format (many rooms are returned; this is one example of the data I get back):
<?xml version="1.0" encoding="UTF-8"?>
<rooms>
<total-results>1</total-results>
<items-per-page>1</items-per-page>
<start-index>0</start-index>
<room>
<id>xxxxxxxx</id>
<etag>5</etag>
<link rel="http://schemas.com.mysite.building" title="building" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/buildings/yyyyyyyyy"/>
<name>1.306</name>
<status>active</status>
<link rel="self" title="self" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/rooms/aaaaaaaaa"/>
</room>
</rooms>
If nodeType == node.TEXT_NODE, I seem to be able to access the data (so I can see that I have room 1.306). Also, I seem to be able to access the nodeName link, but I really need to know if that room is in one of my acceptable buildings, so I need to be able to get to the rest of that line to look at the yyyyyyyyy. Can someone please advise?
OK, @vezult, here is what I finally came up with (working code!) using ElementTree, as you suggested. This is probably not the most pythonic (or ElementTree-ic?) way of doing this, but it seems to work. I'm thrilled to have access now to .tag, .attrib, and .text of every piece of my xml. I welcome any advice on how to make it better.
import urllib2
from xml.etree import ElementTree

# We start out knowing our room name (in_rm) and our building id (out_bldg).
# However, the same room name can exist in many buildings.
# Examine the rooms we've received and get the id of the one with our name that is also in our building.
# Query the API for a list of rooms, getting the response u back.
request = build_request(resourceUrl)
u = urllib2.urlopen(request.to_url())
mydata = u.read()
root = ElementTree.fromstring(mydata)
print 'tree root', root.tag, root.attrib, root.text
for child in root:
    if child.tag == 'room':
        for child2 in child:
            # the id tag comes before the name tag, so hold on to it
            if child2.tag == "id":
                hold_id = child2.text
            # the building link comes before the room name, so hold on to it
            if child2.tag == 'link':  # if this is a link
                if "building" in child2.attrib['href']:  # and it's a building link
                    hold_link_data = child2.attrib['href']
            if child2.tag == 'name':
                if (out_bldg in hold_link_data and  # the building link we're looking at has our building in it
                        in_rm == child2.text):  # and this room name is our room name
                    out_rm = hold_id
                    break  # stop scanning this room's children (the outer loop over rooms continues)
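One way this loop could be tightened, as a rough sketch rather than a definitive version: look up each room's name, id, and building link directly instead of relying on the order in which the child tags arrive. The function name find_room_id and the trimmed SAMPLE document are made up for illustration, and matching the building with endswith() instead of a substring test is an assumption about how the building id appears in the href.

```python
from xml.etree import ElementTree


def find_room_id(xml_text, room_name, building_id):
    """Return the <id> text of the room with this name in this building, or None."""
    root = ElementTree.fromstring(xml_text)
    for room in root.findall('room'):
        # Skip rooms whose <name> does not match, whatever order the tags come in.
        if room.findtext('name') != room_name:
            continue
        for link in room.findall('link'):
            href = link.get('href', '')
            # Assumption: the building id is the last path segment of the href.
            if link.get('title') == 'building' and href.endswith('/' + building_id):
                return room.findtext('id')
    return None


# Hypothetical, trimmed-down copy of the response shown in the question.
SAMPLE = """<rooms>
  <room>
    <id>xxxxxxxx</id>
    <link rel="http://schemas.com.mysite.building" title="building" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/buildings/yyyyyyyyy"/>
    <name>1.306</name>
    <link rel="self" title="self" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/rooms/aaaaaaaaa"/>
  </room>
</rooms>"""

print(find_room_id(SAMPLE, '1.306', 'yyyyyyyyy'))
```

Returning from the function as soon as a match is found also stops the search across all rooms, which a single break inside nested loops does not do.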
Upvotes: 0
Views: 966
Reputation: 5243
You provide no indication of what library you are using, so I'm assuming you are using the standard Python ElementTree module. In that case, do the following:
from xml.etree import ElementTree
tree = ElementTree.fromstring("""<?xml version="1.0" encoding="UTF-8"?>
<rooms>
<total-results>1</total-results>
<items-per-page>1</items-per-page>
<start-index>0</start-index>
<room>
<id>xxxxxxxx</id>
<etag>5</etag>
<link rel="http://schemas.com.mysite.building" title="building" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/buildings/yyyyyyyyy" />
<name>1.306</name>
<status>active</status>
<link rel="self" title="self" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/rooms/aaaaaaaaa" />
</room>
</rooms>
""")
# Select every link element under a room whose title attribute is "building"
for node in tree.findall('./room/link[@title="building"]'):
    # the 'attrib' attribute is a dictionary containing the node attributes
    print(node.attrib['href'])
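If you are in fact using xml.dom.minidom (the nodeType == node.TEXT_NODE check in the question suggests it), attributes are read with getAttribute() rather than through text nodes. The following is only a sketch under that assumption: the SAMPLE string is a trimmed stand-in for the real response, and taking the last path segment of the href as the building id is a guess about the URL layout.

```python
from xml.dom import minidom

# Hypothetical, trimmed-down copy of the response shown in the question.
SAMPLE = """<rooms>
  <room>
    <id>xxxxxxxx</id>
    <link rel="http://schemas.com.mysite.building" title="building" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/buildings/yyyyyyyyy"/>
    <name>1.306</name>
    <link rel="self" title="self" href="https://mysite.me.myschool.edu:8443/ess/scheduleapi/v1/rooms/aaaaaaaaa"/>
  </room>
</rooms>"""

doc = minidom.parseString(SAMPLE)

building_ids = []
for link in doc.getElementsByTagName('link'):
    # Attributes live on the element node itself, not in a child text node.
    if link.getAttribute('title') == 'building':
        href = link.getAttribute('href')
        # Assumption: the building id is whatever follows the final slash.
        building_ids.append(href.rsplit('/', 1)[-1])

print(building_ids)
```

Each collected id can then be compared against your list of acceptable buildings.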
Upvotes: 3