Binyamin Even

Reputation: 3382

Count tags in an XML doc using Python

I have a fairly complicated XML document that I want to parse. My first step is to count the number of <H> tags in it. Here is a simplified version of my XML:

<file xmlns="http://www.namespace.co.il">
    <H Id="1012532" W="2198.05">
        <!-- more tags -->
    </H>
    <H Id="623478" W="3215.05">
        <!-- more tags -->
    </H>
    <!-- etc. -->
</file>

Now what I want to do is to count the H elements, so here is what I tried:

import xml.etree.ElementTree as ET
tree = ET.parse(xml_file)

ns = {'nmsp': 'http://www.namespace.co.il'}
count =1
for HH in tree.iterfind(str(ns['nmsp']+':H')):
   print count
   count=count+1

When I run this code, nothing is printed to the console. Any idea why?

Upvotes: 1

Views: 1840

Answers (2)

Nander Speerstra

Reputation: 1526

I think your question has already been answered.

The answer by zeekay is:

import lxml.etree
doc = lxml.etree.parse(xml)
count = doc.xpath('count(//author)')

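(where, in your version, 'author' should be changed to 'H'. Because your document declares a default namespace, an unprefixed name in XPath will not match, so the call would need a prefix and a namespace map, e.g. doc.xpath('count(//n:H)', namespaces={'n': 'http://www.namespace.co.il'}).)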

EDIT:

As to your own code (it took me some time to figure it out):

Your loop should be

for HH in tree.iterfind('nmsp:H', ns):

As is pointed out here, you need to pass the whole namespace dictionary as the second argument and use the 'nmsp' prefix in the path, rather than concatenating the URI stored under 'nmsp' with the tag name.
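To make the point concrete, here is a minimal sketch of the prefix-plus-dictionary approach. The SAMPLE string is a hypothetical stand-in for the asker's file (parsed with ET.fromstring instead of ET.parse), and the names h_count and h_count_any_depth are mine, not from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mirroring the simplified XML from the question.
SAMPLE = """<file xmlns="http://www.namespace.co.il">
    <H Id="1012532" W="2198.05"/>
    <H Id="623478" W="3215.05"/>
</file>"""

# Prefix-to-URI map; the prefix 'nmsp' is arbitrary and only used in paths.
ns = {'nmsp': 'http://www.namespace.co.il'}
root = ET.fromstring(SAMPLE)

# 'nmsp:H' is expanded through ns; this matches direct children of root only.
h_count = len(root.findall('nmsp:H', ns))

# './/nmsp:H' also matches H elements nested at any depth below root.
h_count_any_depth = sum(1 for _ in root.iterfind('.//nmsp:H', ns))

print(h_count)
```

If the H elements can be nested deeper than the first level, the './/' form is probably the safer choice.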

Upvotes: 2

Murali Mopuru

Reputation: 6570

Is this what you are looking for?

tree.iterfind('{http://www.namespace.co.il}H')
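As a rough illustration of why the '{uri}tag' form works: ElementTree stores a namespaced tag as the URI in braces followed by the local name, so the search string has to have the same shape. The sketch below assumes a hypothetical inline sample in place of the real file; NAMESPACE, clark_tag and clark_count are illustrative names.

```python
import xml.etree.ElementTree as ET

NAMESPACE = 'http://www.namespace.co.il'

# Hypothetical sample mirroring the simplified XML from the question.
SAMPLE = """<file xmlns="http://www.namespace.co.il">
    <H Id="1012532" W="2198.05"/>
    <H Id="623478" W="3215.05"/>
</file>"""

root = ET.fromstring(SAMPLE)

# Tags are stored with the namespace URI wrapped in braces.
print(root.tag)

# Build the fully qualified tag name and count matches at any depth.
clark_tag = '{%s}H' % NAMESPACE
clark_count = sum(1 for _ in root.iter(clark_tag))
print(clark_count)
```

This avoids the namespace dictionary entirely, at the cost of repeating the full URI in every path.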

Upvotes: 2
