user2925795

Reputation: 398

Python: XML list index out of range

I'm having trouble getting some values from an XML file. The error is IndexError: list index out of range.

XML

<?xml version="1.0" encoding="UTF-8"?>
<nfeProc xmlns="http://www.portalfiscal.inf.br/nfe" versao="3.10">
    <NFe xmlns="http://www.portalfiscal.inf.br/nfe">
        <infNFe Id="NFe35151150306471000109550010004791831003689145" versao="3.10">
            <ide>
                <nNF>479183</nNF>
            </ide>
            <emit>
                <CNPJ>3213213212323</CNPJ>
            </emit>
            <det nItem="1">
                <prod>
                    <cProd>7030-314</cProd>
                </prod>
                <imposto>
                    <ICMS>
                        <ICMS10>
                            <orig>1</orig>
                            <CST>10</CST>
                            <vICMS>10.35</vICMS>
                            <vICMSST>88.79</vICMSST>
                        </ICMS10>
                    </ICMS>
                </imposto>
            </det>
            <det nItem="2">
                <prod>
                    <cProd>7050-6</cProd>
                </prod>
                <imposto>
                    <ICMS>
                        <ICMS00>
                            <orig>1</orig>
                            <CST>00</CST>
                            <vICMS>7.49</vICMS>
                        </ICMS00>
                    </ICMS>
                </imposto>
            </det>
        </infNFe>
    </NFe>
</nfeProc>

I'm reading the values from the XML. This works for XML files in which every item has both the vICMS and vICMSST tags:

vicms = doc.getElementsByTagName('vICMS')[i].firstChild.nodeValue

vicmsst = doc.getElementsByTagName('vICMSST')[1].firstChild.nodeValue

For the first det item this prints:

print vicms
>> 10.35
print vicmsst
>> 88.79

The second imposto crashes because it has no vICMSST tag:

**IndexError: list index out of range**

What is the best way to test for this? I'm using xml.etree.ElementTree:

My code:

import os
import sys
import subprocess
import base64,xml.dom.minidom
from xml.dom.minidom import Node
import glob
import xml.etree.ElementTree as ET

origem = 0
# only loops over XML documents in folder
for file in glob.glob("*.xml"):    
    f = open("%s" % file,'r')
    data = f.read()
    i = 0
    doc = xml.dom.minidom.parseString(data)
    for topic in doc.getElementsByTagName('emit'):
       #Get Fiscal Number
       nnf= doc.getElementsByTagName('nNF')[i].firstChild.nodeValue
       print 'Fiscal Number  %s' % nnf
       print '\n'
       for prod in doc.getElementsByTagName('det'):
            vicms = 0
            vicmsst = 0
            #Get value of ICMS
            vicms = doc.getElementsByTagName('vICMS')[i].firstChild.nodeValue
            #Get value of VICMSST
            vicmsst = doc.getElementsByTagName('vICMSST')[i].firstChild.nodeValue   
            #PRINT INFO
            print 'ICMS %s' % vicms
            print 'Valor do ICMSST: %s' % vicmsst
            print '\n\n'
            i +=1
print '\n\n'

Upvotes: 1

Views: 2894

Answers (2)

Tomalak

Reputation: 338228

You are making several general mistakes in your code.

  1. Don't use counters to index into lists you don't know the length of. Normally, iteration with for .. in is a lot better than using indexes anyway.
  2. You have many imports you don't seem to use, get rid of them.
  3. You can use minidom, but ElementTree is better for your task because it supports searching for nodes with XPath and it supports XML namespaces.
  4. Don't read an XML file as a string and then use parseString. Let the XML parser handle the file directly. This way all file encoding related issues will be handled without errors.

The following is a lot better than your original approach.

import glob
import xml.etree.ElementTree as ET

def get_text(context_elem, xpath, xmlns=None):
    """ helper function that gets the text value of a node, or "" if the node is missing """
    node = context_elem.find(xpath, xmlns)
    # compare with "is not None": an Element without children is falsy
    if node is not None:
        return node.text
    else:
        return ""

# set up XML namespace URIs
xmlns = {
    "nfe": "http://www.portalfiscal.inf.br/nfe"
}

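for path in glob.glob("*.xml"):
    # let the parser open the file so it can handle the encoding itself
    doc = ET.parse(path)

    for infNFe in doc.iterfind('.//nfe:infNFe', xmlns):
        # print(...) with a single argument behaves the same on Python 2 and 3
        print('Fiscal Number\t%s' % get_text(infNFe, ".//nfe:nNF", xmlns))

        # search inside each det, so a missing vICMSST only affects that item
        for det in infNFe.iterfind(".//nfe:det", xmlns):
            print(' ICMS\t%s' % get_text(det, ".//nfe:vICMS", xmlns))
            print(' Valor do ICMSST:\t%s' % get_text(det, ".//nfe:vICMSST", xmlns))

print('\n\n')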
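As a side note, ElementTree's built-in `findtext` can stand in for a hand-written helper like `get_text`. The following is only a minimal sketch under some assumptions: `SAMPLE` is a trimmed copy of the XML from the question, and `tax_values` and `NS` are names made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace map; the "nfe" prefix is arbitrary, only the URI has to match.
NS = {"nfe": "http://www.portalfiscal.inf.br/nfe"}

# Trimmed copy of the question's document (an assumption for this demo).
SAMPLE = """<nfeProc xmlns="http://www.portalfiscal.inf.br/nfe">
  <NFe>
    <infNFe>
      <det nItem="1">
        <imposto><ICMS><ICMS10>
          <vICMS>10.35</vICMS>
          <vICMSST>88.79</vICMSST>
        </ICMS10></ICMS></imposto>
      </det>
      <det nItem="2">
        <imposto><ICMS><ICMS00>
          <vICMS>7.49</vICMS>
        </ICMS00></ICMS></imposto>
      </det>
    </infNFe>
  </NFe>
</nfeProc>"""


def tax_values(xml_text):
    """Return (nItem, vICMS, vICMSST) per det; missing tags become ""."""
    root = ET.fromstring(xml_text)
    rows = []
    for det in root.iterfind(".//nfe:det", NS):
        # findtext returns the default instead of raising when nothing matches
        vicms = det.findtext(".//nfe:vICMS", default="", namespaces=NS)
        vicmsst = det.findtext(".//nfe:vICMSST", default="", namespaces=NS)
        rows.append((det.get("nItem"), vicms, vicmsst))
    return rows


for item, vicms, vicmsst in tax_values(SAMPLE):
    print("item %s: ICMS=%s ICMSST=%s" % (item, vicms, vicmsst))
```

Because each lookup starts from its own `det`, the second item simply comes back with an empty string for vICMSST rather than shifting the values of other items.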

Upvotes: 1

Jared Goguen

Reputation: 9010

There is only one vICMSST tag in your XML document, so the list returned by getElementsByTagName('vICMSST') has a single element at index 0. When i=1, the following line raises an IndexError.

vicmsst = doc.getElementsByTagName('vICMSST')[1].firstChild.nodeValue

You can restructure this to:

try:
    vicmsst = doc.getElementsByTagName('vICMSST')[i].firstChild.nodeValue
except IndexError:
    # set a default value or deal with this how you like;
    # None is only a placeholder so that the block is valid Python
    vicmsst = None

It's hard to say what you should do upon an exception without knowing more about what you're trying to do.
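If you want to stay with minidom, one possible alternative to catching the exception is to search inside each det element instead of indexing into a document-wide list, so that a value is never paired with the wrong item. This is just a sketch: `first_text` and `icms_per_item` are hypothetical helper names, and `SAMPLE` is a cut-down version of your XML.

```python
import xml.dom.minidom

# Cut-down version of the question's XML (an assumption for this demo).
SAMPLE = """<infNFe>
  <det nItem="1">
    <imposto><ICMS><ICMS10>
      <vICMS>10.35</vICMS>
      <vICMSST>88.79</vICMSST>
    </ICMS10></ICMS></imposto>
  </det>
  <det nItem="2">
    <imposto><ICMS><ICMS00>
      <vICMS>7.49</vICMS>
    </ICMS00></ICMS></imposto>
  </det>
</infNFe>"""


def first_text(parent, tag, default=None):
    """Text of the first <tag> below parent, or default if there is none."""
    nodes = parent.getElementsByTagName(tag)
    # an empty result list is falsy, so no index is used before checking
    if nodes and nodes[0].firstChild is not None:
        return nodes[0].firstChild.nodeValue
    return default


def icms_per_item(xml_text):
    """Return (nItem, vICMS, vICMSST) for every det element."""
    doc = xml.dom.minidom.parseString(xml_text)
    rows = []
    for det in doc.getElementsByTagName("det"):
        rows.append((det.getAttribute("nItem"),
                     first_text(det, "vICMS"),
                     first_text(det, "vICMSST")))
    return rows


for row in icms_per_item(SAMPLE):
    print(row)
```

Here the second item gets `None` for vICMSST; pick whatever default makes sense for your data.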

Upvotes: 3
