Rahul

Reputation: 615

Why is if statement not working in ElementTree parsing?

I'm trying to parse an xml file using ElementTree which looks like this:

<Game>
  <Event timestamp="2016-08-14T14:23:33.634" id="1713385925" 
         version="1471181110290" last_modified="2016-08-14T14:25:11" y="11.0" 
         x="89.7" outcome="0" team_id="148" player_id="51327" sec="8" min="23" 
         period_id="1" type_id="4" event_id="205">

    <Q id="733814222" qualifier_id="265"/>
    <Q id="481660420" qualifier_id="286"/>
    <Q id="813378778" qualifier_id="152"/>
    <Q id="570443899" qualifier_id="56" value="Right"/>
    <Q id="420312891" qualifier_id="233" value="248"/>
    <Q id="1186861264" qualifier_id="13"/>
  </Event>

  <Event timestamp="2016-08-14T14:23:33.634" id="1635888622" 
         version="1471181110289" last_modified="2016-08-14T14:25:11" y="89.0" 
         x="10.3" outcome="1" team_id="143" player_id="169007" sec="8" min="23" 
         period_id="1" type_id="4" event_id="248">

    <Q id="1871787686" qualifier_id="56" value="Back"/>
    <Q id="176295814" qualifier_id="13"/>
    <Q id="69346842" qualifier_id="233" value="205"/>
    <Q id="1588029344" qualifier_id="265"/>
    <Q id="559785299" qualifier_id="285"/>
    <Q id="380723313" qualifier_id="152"/>
  </Event>
</Game>

The code I'm using is simple and works as expected. However, everything changes when I add an if condition to it:

import xml.etree.ElementTree as ET

root = ET.parse(r'C:\Users\ADMIN\Desktop\Abhishek\PSG - Copy\Sample.xml').getroot()

Games = root.getchildren()
for Game in Games:
    Events = Game.getchildren()
    for Event in Events:
        type_id = Event.attrib["type_id"]
        team_id = Event.attrib["team_id"]
        Qualifiers = Event.getchildren()
        for Qualifier in Qualifiers:
            id_ = Qualifier.attrib['id']
            if id_ == 142:
                print ("val")

Here's the error it's producing:

Warning (from warnings module):
  File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python37\PSGPossessionSequences.py", line 9
    Games = root.getchildren()
DeprecationWarning: This method will be removed in future versions.  Use 'list(elem)' or iteration over elem instead.

Warning (from warnings module):
  File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python37\PSGPossessionSequences.py", line 11
    Events = Game.getchildren()
DeprecationWarning: This method will be removed in future versions.  Use 'list(elem)' or iteration over elem instead.

Warning (from warnings module):
  File "C:\Users\ADMIN\AppData\Local\Programs\Python\Python37\PSGPossessionSequences.py", line 15
    Qualifiers = Event.getchildren()
DeprecationWarning: This method will be removed in future versions.  Use 'list(elem)' or iteration over elem instead.

I have tried removing the if statement, and that works perfectly. However, I do need a condition that selects all the id_s that have a certain value. I've tried using "142" as well as 142, but the problem persists. Why exactly is this happening?

Upvotes: 5

Views: 4595

Answers (2)

Martijn Pieters

Reputation: 1123620

The errors you see are not errors, but warnings. You can ignore them, silence them, or fix your code by not using .getchildren(); you can iterate directly over each XML element instead:

root = ET.parse(r'C:\Users\ADMIN\Desktop\Abhishek\PSG - Copy\Sample.xml').getroot()

for Game in root:
    for Event in Game:
        # ...
        for Qualifier in Event:
            id_ = Qualifier.attrib['id']

The if test doesn't work because XML attribute values are always strings (text), never integers. Test for a string:

if id_ == "142":
    print("val")
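
To make the type mismatch concrete, here is a minimal sketch that parses a trimmed copy of the sample data from a string. The SAMPLE constant and the matching_ids helper are names made up for this illustration; they are not part of the original code.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical copy of the question's data, inlined so the
# sketch runs without the original file.
SAMPLE = """
<Game>
  <Event type_id="4" team_id="148">
    <Q id="733814222" qualifier_id="265"/>
    <Q id="570443899" qualifier_id="56" value="Right"/>
  </Event>
</Game>
"""


def matching_ids(root, wanted):
    """Return the id attribute of every Q element whose id equals `wanted`."""
    return [q.attrib["id"] for q in root.iter("Q") if q.attrib["id"] == wanted]


root = ET.fromstring(SAMPLE)
first_q = next(root.iter("Q"))

print(type(first_q.attrib["id"]))       # attribute values come back as str
print(matching_ids(root, 733814222))    # an int never equals a str, so no match
print(matching_ids(root, "733814222"))  # comparing against a str matches
```

If numeric comparison is really needed, converting with int(q.attrib["id"]) before comparing is the other way round; which side to convert is a design choice.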

You may want to use XPath queries instead of looping over everything. The base ElementTree implementation that comes with Python is a little limited, though. You would get a far more powerful implementation if you installed the lxml library; its XPath support is far superior:

from lxml import etree as ET

document = ET.parse(r'C:\Users\ADMIN\Desktop\Abhishek\PSG - Copy\Sample.xml')
root = document.getroot()

qualifier = root.xpath(".//Event/Q[@id='142']")[0]
event = qualifier.getparent()
type_id = event.attrib["type_id"]
team_id = event.attrib["team_id"]
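
If installing lxml is not an option, the limited XPath subset in the standard library can still express the attribute test. The following is only a rough sketch, assuming the Q elements sit directly under Event as in the sample; SAMPLE and events_with_qualifier are illustrative names. Standard-library elements have no getparent(), so the sketch walks the events and searches each one's children instead.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical copy of the question's data.
SAMPLE = """
<Game>
  <Event type_id="4" team_id="148" event_id="205">
    <Q id="733814222" qualifier_id="265"/>
    <Q id="570443899" qualifier_id="56" value="Right"/>
  </Event>
  <Event type_id="4" team_id="143" event_id="248">
    <Q id="1871787686" qualifier_id="56" value="Back"/>
  </Event>
</Game>
"""


def events_with_qualifier(root, q_id):
    """Yield (type_id, team_id) for each Event that has a Q child with id q_id."""
    for event in root.iter("Event"):
        # The predicate compares the attribute against text, so q_id
        # is interpolated as a string.
        if event.find(f"Q[@id='{q_id}']") is not None:
            yield event.attrib["type_id"], event.attrib["team_id"]


root = ET.fromstring(SAMPLE)
print(list(events_with_qualifier(root, "1871787686")))
```

Using find() and checking for None also avoids the IndexError that indexing an empty result list with [0] would raise when nothing matches.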

Upvotes: 4

andrew pate

Reputation: 4297

It's a warning that the getchildren() method is deprecated. Here is how to get the children now without the warning:

def goddamnit_what_are_my_kids_called(self, element):
    for child in list(element):
        print(child.tag)
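
The same idea works as a plain function outside a class. This is a small sketch on a made-up inline document; child_tags is a hypothetical helper name. Iterating over an element yields its direct children, so list() is only needed when an actual list is wanted. Since getchildren() was removed entirely in Python 3.9, code meant to run there has to use one of these forms.

```python
import xml.etree.ElementTree as ET


def child_tags(element):
    """Return the tag names of an element's direct children."""
    return [child.tag for child in element]


# Made-up document with the same nesting as the question's sample.
root = ET.fromstring("<Game><Event><Q/><Q/></Event><Event/></Game>")

print(child_tags(root))     # tags of the root's direct children
print(child_tags(root[0]))  # tags of the first Event's children
print(len(list(root)))      # list(elem) replaces elem.getchildren()
```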

Upvotes: 5
