Reputation: 189
This XML document contains a set of events-data tags. I want to extract information from the most recent events-data. For example, in the XML below I want to go to the last events-data tag, go down to its event-date tag, and extract the text of the date child tag. At the moment I am using BeautifulSoup in Python to traverse this document. Any ideas?
<?xml version="1.0" encoding="UTF-8"?>
<first-tag>
  <second-tag>
    <events-data>
      <event-date>
        <date>20040913</date>
      </event-date>
    </events-data>
    <events-data> <!-- the one I want to traverse to grab the date text -->
      <event-date>
        <date>20040913</date>
      </event-date>
    </events-data>
  </second-tag>
</first-tag>
Upvotes: 0
Views: 1129
Reputation: 3098
This uses BeautifulSoup 3:
# BeautifulSoup 3 (the Python 2 era package)
from BeautifulSoup import BeautifulStoneSoup

xml_str = '''<?xml version="1.0" encoding="UTF-8"?>
<first-tag>
  <second-tag>
    <events-data>
      <event-date>
        <date>20040913</date>
      </event-date>
    </events-data>
    <events-data>
      <event-date>
        <date>20040913</date>
      </event-date>
    </events-data>
  </second-tag>
</first-tag>
'''

soup = BeautifulStoneSoup(xml_str)

# Collect every events-data element, in document order
events = soup.findAll("events-data")

if events:
    # The last events-data element; .text is the text it contains
    print(events[-1].text)
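If BeautifulSoup is not a hard requirement, the same lookup can be sketched with only the standard library's xml.etree.ElementTree. This is a minimal sketch, not a drop-in replacement: the helper name last_event_date is made up for illustration, and it assumes that "most recent" means the last events-data element in document order (the XML in the question does not say the entries are sorted by date).

```python
import xml.etree.ElementTree as ET

# Same document as in the question; the XML declaration must be the very
# first thing in the string, with no leading whitespace.
XML_DOC = """<?xml version="1.0" encoding="UTF-8"?>
<first-tag>
  <second-tag>
    <events-data>
      <event-date>
        <date>20040913</date>
      </event-date>
    </events-data>
    <events-data>
      <event-date>
        <date>20040913</date>
      </event-date>
    </events-data>
  </second-tag>
</first-tag>
"""


def last_event_date(xml_text):
    """Return the date text of the last events-data element, or None.

    Hypothetical helper: assumes the last element in document order
    is the most recent one.
    """
    root = ET.fromstring(xml_text)
    # findall returns matching elements in document order
    events = root.findall(".//events-data")
    if not events:
        return None
    # Walk down event-date/date relative to the last events-data element
    return events[-1].findtext("event-date/date")


print(last_event_date(XML_DOC))
```

If the entries are not guaranteed to be in chronological order, comparing the date strings themselves (they sort correctly as YYYYMMDD text) would be the safer choice.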
Upvotes: 1