Python:XML List index out of range

Question

I'm having trouble getting some values out of an XML file. The error is IndexError: list index out of range.

XML



    
        
            
                479183
            
            
                3213213212323
            
            
                
                    7030-314
                
                
                    
                        
                            1
                            10
                            10.35
                            88.79
                        
                    
                
            
            
                
                    7050-6
                
                
                    
                        
                            1
                            00
                            7.49

Reading the values works for some XML files, namely those that have both vICMS and vICMSST tags:

vicms = doc.getElementsByTagName('vICMS')[i].firstChild.nodeValue

vicmsst = doc.getElementsByTagName('vICMSST')[i].firstChild.nodeValue

For the first imposto this returns:

print vicms
>> 10.35
print vicmsst
>> 88.79

The second imposto crashes because it has no vICMSST tag:

**IndexError: list index out of range**

What is the best way to test for a missing tag? I'm using xml.etree.ElementTree:

My code:

import os
import sys
import subprocess
import base64,xml.dom.minidom
from xml.dom.minidom import Node
import glob
import xml.etree.ElementTree as ET

origem = 0
# only loops over XML documents in folder
for file in glob.glob("*.xml"):    
    f = open("%s" % file,'r')
    data = f.read()
    i = 0
    doc = xml.dom.minidom.parseString(data)
    for topic in doc.getElementsByTagName('emit'):
       #Get Fiscal Number
       nnf= doc.getElementsByTagName('nNF')[i].firstChild.nodeValue
       print 'Fiscal Number  %s' % nnf
       print '\n'
       for prod in doc.getElementsByTagName('det'):
            vicms = 0
            vicmsst = 0
            #Get value of ICMS
            vicms = doc.getElementsByTagName('vICMS')[i].firstChild.nodeValue
            #Get value of VICMSST
            vicmsst = doc.getElementsByTagName('vICMSST')[i].firstChild.nodeValue   
            #PRINT INFO
            print 'ICMS %s' % vicms
            print 'Valor do ICMSST: %s' % vicmsst
            print '\n\n'
            i +=1
print '\n\n'

Tomalak · Accepted Answer

You are making several general mistakes in your code.

Don't use counters to index into lists you don't know the length of. Here the document-wide list of vICMSST elements is shorter than the list of det elements, so the shared counter i runs past its end. Normally, iteration with for .. in is a lot better than using indexes anyway.
You have many imports you don't seem to use; get rid of them.
You can use minidom, but ElementTree is better suited to your task because it can search for nodes with a subset of XPath and it handles XML namespaces.
Don't read an XML file into a string and then call parseString. Let the XML parser open the file directly. That way the parser picks up the encoding from the XML declaration and decodes the file itself.
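To make the first point concrete, here is a minimal sketch of why the shared counter fails and how searching inside each det element avoids it. The sample document is an assumption (a trimmed, namespace-free stand-in for the real file), and first_text is a hypothetical helper, not part of minidom.

```python
import xml.dom.minidom

# Hypothetical sample: two <det> items, only the first one has <vICMSST>.
SAMPLE = """<infNFe>
  <det><vICMS>10.35</vICMS><vICMSST>88.79</vICMSST></det>
  <det><vICMS>7.49</vICMS></det>
</infNFe>"""

doc = xml.dom.minidom.parseString(SAMPLE)

# The document-wide lists have different lengths, so indexing both
# with the same counter fails as soon as i reaches the shorter length.
det_count = len(doc.getElementsByTagName('det'))
vicmsst_count = len(doc.getElementsByTagName('vICMSST'))
print(det_count, vicmsst_count)


def first_text(parent, tag, default=''):
    """Return the text of the first <tag> below parent, or default if absent."""
    nodes = parent.getElementsByTagName(tag)
    if nodes and nodes[0].firstChild is not None:
        return nodes[0].firstChild.nodeValue
    return default


# Search relative to each <det> instead of indexing document-wide lists.
rows = [(first_text(det, 'vICMS'), first_text(det, 'vICMSST'))
        for det in doc.getElementsByTagName('det')]
print(rows)
```

The string is parsed in memory only to keep the sketch self-contained; for real files, pass the path to the parser as described above.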

The following is a lot better than your original approach.

import glob
import xml.etree.ElementTree as ET

def get_text(context_elem, xpath, xmlns=None):
    """ helper function that gets the text value of a node """
    node = context_elem.find(xpath, xmlns)
    if node is not None:
        return node.text
    else:
        return ""

# set up XML namespace URIs
xmlns = {
    "nfe": "http://www.portalfiscal.inf.br/nfe"
}

for path in glob.glob("*.xml"):
    doc = ET.parse(path)

    for infNFe in doc.iterfind('.//nfe:infNFe', xmlns):
        print 'Fiscal Number\t%s' % get_text(infNFe, ".//nfe:nNF", xmlns)

        for det in infNFe.iterfind(".//nfe:det", xmlns):
            print ' ICMS\t%s' % get_text(det, ".//nfe:vICMS", xmlns)
            print ' Valor do ICMSST:\t%s' % get_text(det, ".//nfe:vICMSST", xmlns)

print '\n\n'
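If you want to try the idea without a folder of invoices, the following is a rough Python 3 sketch of the same approach run against an in-memory document. The sample XML is an assumption: the wrapper element names (NFe, ide, imposto) and the nesting are guesses for illustration, and only the namespace URI and the nNF, det, vICMS and vICMSST names come from the code above. The default parameter on get_text is also an addition for this sketch.

```python
import xml.etree.ElementTree as ET

NS = {"nfe": "http://www.portalfiscal.inf.br/nfe"}

# Hypothetical, heavily trimmed sample; the real layout may differ.
SAMPLE = """<NFe xmlns="http://www.portalfiscal.inf.br/nfe">
  <infNFe>
    <ide><nNF>479183</nNF></ide>
    <det><imposto><vICMS>10.35</vICMS><vICMSST>88.79</vICMSST></imposto></det>
    <det><imposto><vICMS>7.49</vICMS></imposto></det>
  </infNFe>
</NFe>"""


def get_text(context_elem, xpath, xmlns=None, default=""):
    """Return the text of the first match below context_elem, or default."""
    node = context_elem.find(xpath, xmlns)
    return node.text if node is not None else default


# fromstring is used only so the sketch needs no files on disk.
root = ET.fromstring(SAMPLE)

results = []
for inf_nfe in root.iterfind(".//nfe:infNFe", NS):
    nnf = get_text(inf_nfe, ".//nfe:nNF", NS)
    for det in inf_nfe.iterfind(".//nfe:det", NS):
        # A missing vICMSST yields "" instead of raising IndexError.
        results.append((nnf,
                        get_text(det, ".//nfe:vICMS", NS),
                        get_text(det, ".//nfe:vICMSST", NS)))

print(results)
```

Because every lookup is relative to the current det element, a det without vICMSST simply produces an empty string rather than shifting or overrunning an index.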
