Reputation: 1

Python 3.4 - XML Parse - IndexError: List Index Out of Range - How do I find range of XML?

Okay guys, I'm new to parsing XML and Python, and am trying to get this to work. If someone could help me with this it would be greatly appreciated. If you can help me (educate me) on how to figure it out for myself, that would be even better!

I am having trouble trying to figure out the range to reference for an XML document as I can't find any documentation on it. Here is my code and I'll include the entire Traceback after.

#import library to do http requests:
import urllib.request

#import easy to use xml parser called minidom:
from xml.dom.minidom import parseString
#all these imports are standard on most modern python implementations

#download the file:
file = urllib.request.urlopen('http://www.wizards.com/dndinsider/compendium/CompendiumSearch.asmx/KeywordSearch?Keywords=healing%20%word&nameOnly=True&tab=')
#convert to string:
data = file.read()
#close file because we dont need it anymore:
file.close()
#parse the xml you downloaded
dom = parseString(data)
#retrieve the first xml tag (<tag>data</tag>) that the parser finds with name tagName:
xmlTag = dom.getElementsByTagName('Data.Results.Power.ID')[0].toxml()
#strip off the tag (<tag>data</tag>  --->   data):
xmlData=xmlTag.replace('<id>','').replace('</id>','')
#print out the xml tag and data in this format: <tag>data</tag>
print(xmlTag)
#just print the data
print(xmlData)

Traceback

/usr/bin/python3.4 /home/mint/PycharmProjects/DnD_Project/Power_Name.py
Traceback (most recent call last):
  File "/home/mint/PycharmProjects/DnD_Project/Power_Name.py", line 14, in <module>
xmlTag = dom.getElementsByTagName('id')[0].toxml()
IndexError: list index out of range

Process finished with exit code 1

Upvotes: 0

Answers (2)

furas

Reputation: 142651

print(len(dom.getElementsByTagName('id')))

EDIT:

ids = dom.getElementsByTagName('id')

if len(ids) > 0:
    xmlTag = ids[0].toxml()
    # rest of code

EDIT: I added an example because I saw in another comment that you don't know how to use it.

BTW: I added some comments in the code about the file/connection.

import urllib.request

from xml.dom.minidom import parseString

# create connection to data/file on server
connection = urllib.request.urlopen('http://www.wizards.com/dndinsider/compendium/CompendiumSearch.asmx/KeywordSearch?Keywords=healing%20%word&nameOnly=True&tab=')

# read the response body from the server (read() returns bytes; nothing is "converted"):
data = connection.read()

# close the connection because we don't need it anymore:
connection.close()

dom = parseString(data)

# get tags from dom
ids = dom.getElementsByTagName('Data.Results.Power.ID')

# check whether any tags were found
if len(ids) > 0:
    xmlTag = ids[0].toxml()
    xmlData = xmlTag.replace('<id>','').replace('</id>','')
    print(xmlTag)
    print(xmlData)
else:
    print("Sorry, there was no data")

Or you can use a for loop if there is more than one tag:

dom = parseString(data)

# get tags from dom
ids = dom.getElementsByTagName('Data.Results.Power.ID')

# get all tags - one by one
for one_tag in ids:
    xmlTag = one_tag.toxml()
    xmlData = xmlTag.replace('<id>','').replace('</id>','')
    print(xmlTag)
    print(xmlData)

BTW:

getElementsByTagName() expects the tag name ID, not the path Data.Results.Power.ID.
The tag name is ID, so you have to replace <ID>, not <id>.
For this tag you can even use one_tag.firstChild.nodeValue in place of xmlTag.replace.

dom = parseString(data)

# get tags from dom
ids = dom.getElementsByTagName('ID') # tagname

# get all tags - one by one
for one_tag in ids:
    xmlTag = one_tag.toxml()
    #xmlData = xmlTag.replace('<ID>','').replace('</ID>','')
    xmlData = one_tag.firstChild.nodeValue
    print(xmlTag)
    print(xmlData)
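
To see why the original lookups came back empty without depending on the remote service, here is a minimal offline sketch. The XML string is a made-up stand-in (its Data/Results/Power/ID nesting is only an assumption taken from the path in the question, and the values are invented), so the real response may look different.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document; the real service response may be shaped differently.
SAMPLE_XML = """<Data>
  <Results>
    <Power><ID>101</ID><Name>Healing Word</Name></Power>
    <Power><ID>102</ID><Name>Healing Strike</Name></Power>
  </Results>
</Data>"""

dom = parseString(SAMPLE_XML)

# A dotted path is treated as one literal tag name, so nothing matches.
by_path = dom.getElementsByTagName('Data.Results.Power.ID')

# Tag names are case-sensitive: 'id' does not match <ID>.
by_lower = dom.getElementsByTagName('id')

# The plain tag name matches every <ID> element, at any depth.
by_name = dom.getElementsByTagName('ID')

# Each <ID> holds a single text node, so firstChild.nodeValue is its text.
id_values = [node.firstChild.nodeValue for node in by_name]

print(len(by_path), len(by_lower), len(by_name))
print(id_values)
```

An empty result list is what makes [0] raise IndexError, so printing len() of each lookup is a quick way to check which name actually matches.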

Upvotes: 1

moorej

Reputation: 527

I haven't used the built-in xml library in a while, but it's covered in Mark Pilgrim's great Dive into Python book.

-- I see as I'm typing this that your question has already been answered, but since you mention being new to Python, I think you will find the text useful for XML parsing and as an excellent introduction to the language.

If you would like to try another approach to parsing XML and HTML, I highly recommend lxml.
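
As a rough illustration of the path-style lookup that this kind of API allows, here is a sketch using the standard library's xml.etree.ElementTree, so nothing extra needs installing. lxml.etree offers a broadly compatible interface, but this is not lxml itself. The sample XML and its nesting are assumptions for illustration, not the real service response.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real response may be shaped differently.
SAMPLE_XML = """<Data>
  <Results>
    <Power><ID>101</ID><Name>Healing Word</Name></Power>
    <Power><ID>102</ID><Name>Healing Strike</Name></Power>
  </Results>
</Data>"""

root = ET.fromstring(SAMPLE_XML)  # root is the <Data> element

# Paths use '/' separators and are relative to the element being searched.
ids = [element.text for element in root.findall('./Results/Power/ID')]

# find() returns None when nothing matches, so there is no IndexError to guard against.
first = root.find('./Results/Power/ID')
first_id = first.text if first is not None else None

# Tag names are still case-sensitive here.
missing = root.find('./Results/Power/id')

print(ids)
print(first_id)
print(missing)
```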

Upvotes: 0
