sudarsan

Reputation: 41

Python: reading an XML file with multiple namespaces

  <?xml version="1.0" encoding="UTF-8"?>
        <country:list version="3.0" xmlns:country="http://Some/www.home.com" xmlns:region="http://some/www.hello.com" xmlns:Location="http://Some/www.home.com">
            <country:Region111>
                <Some_child_tags>
                    <region:tag1 name="1">some contents in country:Region111 </region:tag1>

                    <tags>

                            .
                            .
                            .
                            .
                    </tags>
                    <region:tag1 name="2">Some other contents in country:Region111</region:tag1>
                </Some_child_tags>
            </country:Region111>

            <Location:Region222>
            <Some_child_tags>
                    <region:tag1 name="1">some contents in Location:Region222</region:tag1>
                    <tags>

                            .
                            .
                            .
                            .
                    </tags>
                    <region:tag1 name="2">Some other contents in Location:Region222</region:tag1>
                </Some_child_tags>
            </Location:Region222>
        </country:list>

I want to retrieve the text content and attribute values of every <region:tag1> element that appears under <country:Region111>...</country:Region111>, but not those under <Location:Region222>...</Location:Region222>. The final output should be the following:

name 1 some contents in country:Region111                                
name 2 Some other contents in country:Region111
The <region:tag1> contents that come from <Location:Region222> should be excluded.

Upvotes: 0

Views: 1310

Answers (1)

RomanPerekhrest

Reputation: 92854

Your input XML document should look as below (to be valid):

<?xml version="1.0" encoding="UTF-8"?>
<country:list version="3.0" xmlns:country="http://Some/www.home.com" xmlns:region="http://some/www.hello.com">
<country:Region111>
     <Some_child_tags>
      <region:tag1>some contents</region:tag1>
     </Some_child_tags>
</country:Region111>
</country:list>

A solution using the xml.etree.ElementTree module:

import xml.etree.ElementTree as ET

tree = ET.parse("yourfile.xml")
root = tree.getroot()
tag1 = root.find('.//{http://some/www.hello.com}tag1')  # accessing tag with namespace

print(tag1.text)

The output:

some contents
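
The snippet above returns only the first <region:tag1> in the document. To get what the question asks for (every <region:tag1> under <country:Region111>, with its name attribute, and nothing from <Location:Region222>), something along these lines should work. This is only a sketch: the NS prefix map, the region111_tags helper and the trimmed SAMPLE string are names made up for illustration, with the namespace URIs copied from the question. Note that in the question's XML the country and Location prefixes map to the same URI, so it is the local name Region111, not the namespace, that separates the two regions.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map (hypothetical name); URIs are copied from the question's XML.
NS = {
    "country": "http://Some/www.home.com",
    "region": "http://some/www.hello.com",
}

# Trimmed copy of the question's document, used here instead of a file.
SAMPLE = """\
<country:list version="3.0"
    xmlns:country="http://Some/www.home.com"
    xmlns:region="http://some/www.hello.com"
    xmlns:Location="http://Some/www.home.com">
  <country:Region111>
    <Some_child_tags>
      <region:tag1 name="1">some contents in country:Region111 </region:tag1>
      <region:tag1 name="2">Some other contents in country:Region111</region:tag1>
    </Some_child_tags>
  </country:Region111>
  <Location:Region222>
    <Some_child_tags>
      <region:tag1 name="1">some contents in Location:Region222</region:tag1>
      <region:tag1 name="2">Some other contents in Location:Region222</region:tag1>
    </Some_child_tags>
  </Location:Region222>
</country:list>
"""


def region111_tags(root):
    """Collect (name, text) for each region:tag1 below country:Region111."""
    results = []
    # Region111 is a direct child of the root element.
    for region in root.findall("country:Region111", NS):
        # './/' searches all descendants of that region only.
        for tag in region.findall(".//region:tag1", NS):
            results.append((tag.get("name"), (tag.text or "").strip()))
    return results


root = ET.fromstring(SAMPLE)
rows = region111_tags(root)
for name, text in rows:
    print("name", name, text)
```

For a file on disk, replace ET.fromstring(SAMPLE) with ET.parse("yourfile.xml").getroot(), as in the snippet above.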

Upvotes: 1
