PythonLearner

Reputation: 93

Python XML parsing: accounting for a missing tag

I am trying to parse XML data from https://www.boardgamegeek.com/xmlapi/collection/eekspider. Some games are missing the yearpublished tag (e.g. Nature Trail Game). I handled the missing tag with try/except in my code, and I was wondering if there is another way to do it.

import urllib.request
import xml.etree.ElementTree as ET

counts = {}  # game name -> number of occurrences
url = input('Please enter the url-')
xml = urllib.request.urlopen(url).read()
tree = ET.fromstring(xml)
lista = tree.findall('*')

for value in lista:
    try:
        print("Game name:", value.find('name').text)
        print("Publication Date:", value.find('yearpublished').text)
        #print("Statistics:", value.find('stats').attrib)
        print("----------")
        game = value.find('name').text
        counts[game] = counts.get(game, 0) + 1
        date = value.find('yearpublished').text
    except AttributeError:
        # find() returned None for a missing tag, so .text failed
        print("Publication Date: unknown")
        print("----------")

Upvotes: 2

Views: 1893

Answers (1)

klaas

Reputation: 1811

The documentation says that the find() method

Returns an element instance or None

so you can test whether the returned value is None like this:

val = value.find('yearpublished')
if val is not None:
    date = val.text
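
To try the check without hitting the network, here is a minimal sketch. The SAMPLE document and the publication_year() helper are made up for illustration, and I am assuming the real feed has a similar item/name/yearpublished layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real feed's layout is assumed to be similar.
SAMPLE = """
<items>
  <item><name>Risk</name><yearpublished>1959</yearpublished></item>
  <item><name>Nature Trail Game</name></item>
</items>
"""

def publication_year(item):
    """Return the text of <yearpublished>, or "unknown" if the tag is absent."""
    val = item.find('yearpublished')
    # Compare with None explicitly: `if val:` would test the element's
    # truthiness, which depends on whether it has child elements.
    if val is not None:
        return val.text
    return "unknown"

years = [publication_year(item) for item in ET.fromstring(SAMPLE)]
print(years)
```

The explicit `is not None` comparison matters because an element's truth value is not a reliable "was it found" test.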

See the xml.etree.ElementTree documentation for find().

This is the resulting code:

for value in lista:
    print("----------")
    game = value.find('name').text
    print("Game name:", game)
    #counts[game]=counts.get(game,0)+1
    date = "unknown"  # default when the tag is missing
    val = value.find('yearpublished')
    if val is not None:
        date = val.text
    print("Publication Date:", date)

Sample output (skipping the first lines):

...
Game name: New World: A Carcassonne Game
Publication Date: 2008
----------
Game name: Nippon Rails
Publication Date: 1992
----------
Game name: Rat Hot
Publication Date: 2005
----------
Game name: Risk
Publication Date: 1959
----------
Game name: Russian Rails
Publication Date: 2004
----------
Game name: Skip-Bo
Publication Date: 1967
----------
Game name: Starship Catan
Publication Date: 2001
----------
Game name: Super Scrabble
Publication Date: 2004
----------
Game name: Ticket to Ride: Nordic Countries
Publication Date: 2007
----------
Game name: Times Square
Publication Date: 2006
----------
Game name: Upwords
Publication Date: 1981
----------
Game name: Xanth
Publication Date: 1991
----------
Game name: Zombie Fluxx
Publication Date: 2007
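
Another option that might be worth a look is findtext(), which takes a default that is returned when the tag is missing, so the None check can be folded into a single call. This is only a sketch: the SAMPLE document is made up, and I am assuming the real feed has a similar layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real feed's layout is assumed to be similar.
SAMPLE = """
<items>
  <item><name>Risk</name><yearpublished>1959</yearpublished></item>
  <item><name>Nature Trail Game</name></item>
</items>
"""

rows = []
for item in ET.fromstring(SAMPLE):
    # findtext() returns the default when no matching child element exists.
    name = item.findtext('name', default='unknown')
    year = item.findtext('yearpublished', default='unknown')
    rows.append((name, year))
    print("Game name:", name)
    print("Publication Date:", year)
    print("----------")
```

One caveat: if the tag is present but empty, findtext() returns an empty string rather than the default, so that case would still need its own handling.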

Upvotes: 1
