Rajendra V

Reputation: 43

Parsing XML with namespaces in Python 3 gives no data

I have an XML file with three namespaces.

<?xml version="1.0" encoding="UTF-8"?>
<cus:Customizations xmlns:cus="http://www.bea.com/wli/config/customizations" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xt="http://www.bea.com/wli/config/xmltypes">
  <cus:customization xsi:type="cus:EnvValueCustomizationType">
    <cus:description/>
    <cus:envValueAssignments>
      <xt:envValueType>working manager</xt:envValueType>
      <xt:location xsi:nil="true"/>
      <xt:owner>
        <xt:type>FLOW</xt:type>
        <xt:path>/somedir/dir/somepath3</xt:path>
      </xt:owner>
      <xt:value xsi:type="xs:string" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
    </cus:envValueAssignments>
  </cus:customization>
  <cus:customization xsi:type="cus:FindAndReplaceCustomizationType">
    <cus:description/>
    <cus:query>
      <xt:resourceTypes>ProxyService</xt:resourceTypes>
      <xt:resourceTypes>SMTPServer</xt:resourceTypes>
      <xt:resourceTypes>SSconection</xt:resourceTypes>
      <xt:refsToSearch xsi:type="xt:ResourceRefType">
        <xt:type>FLOW</xt:type>
        <xt:path>/somedir/dir/somepath2</xt:path>
      </xt:refsToSearch>
      <xt:includeOnlyModifiedResources>false</xt:includeOnlyModifiedResources>
      <xt:searchString>Search String</xt:searchString>
      <xt:isCompleteMatch>false</xt:isCompleteMatch>
    </cus:query>
    <cus:replacement>Replacement String</cus:replacement>
  </cus:customization>
  <cus:customization xsi:type="cus:ReferenceCustomizationType">
    <cus:description/>
    <cus:refsToBeConsidered xsi:type="xt:ResourceRefType">
      <xt:type>FLOW</xt:type>
      <xt:path>/somedir/dir/somepath</xt:path>
    </cus:refsToBeConsidered>
    <cus:refsToBeConsidered xsi:type="xt:ResourceRefType">
      <xt:type>WSDL</xt:type>
      <xt:path>/somedir/dir/somepath</xt:path>
    </cus:refsToBeConsidered>
    <cus:refsToBeConsidered xsi:type="xt:ResourceRefType">
      <xt:type>ProxyService</xt:type>
      <xt:path>/somedir/dir/somepath</xt:path>
    </cus:refsToBeConsidered>
    <cus:externalReferenceMap>
      <xt:oldRef>
        <xt:type>FLOW</xt:type>
        <xt:path>/somedir/dir/somepath</xt:path>
      </xt:oldRef>
      <xt:newRef>
        <xt:type>FLOW</xt:type>
        <xt:path>/somedir/dir/somepath</xt:path>
      </xt:newRef>
    </cus:externalReferenceMap>
    <cus:externalReferenceMap>
      <xt:oldRef>
        <xt:type>XMLSchema</xt:type>
        <xt:path>/somedir/dir/somepath</xt:path>
      </xt:oldRef>
      <xt:newRef>
        <xt:type>XMLSchema</xt:type>
        <xt:path>/somedir/dir/somepath</xt:path>
      </xt:newRef>
    </cus:externalReferenceMap>
    <cus:externalReferenceMap>
      <xt:oldRef>
        <xt:type>XMLSchema</xt:type>
        <xt:path>/somedir/dir/somepath</xt:path>
      </xt:oldRef>
      <xt:newRef>
        <xt:type>XMLSchema</xt:type>
        <xt:path>/somedir/dir/somepath</xt:path>
      </xt:newRef>
    </cus:externalReferenceMap>
  </cus:customization>
</cus:Customizations>

I am using lxml in Python 3, but I am getting empty data. When I print the root, I get the root tag. Here is my code:

#!/usr/bin/python3

import sys
import os
import os.path
import csv
import xml.etree.ElementTree as etree
import lxml.etree

times = []
keys = []
tree2 = lxml.etree.parse('/home/vagrant/dev_dir/ALSBCustomizationFile.xml')
NSMAP = {'cus': 'http://www.bea.com/wli/config/customizations',
         'xsi': 'http://www.w3.org/2001/XMLSchema-instance',
         'xt': 'http://www.bea.com/wli/config/xmltypes'}

root22 = tree2.getroot()

print(root22)
namespace = root22.findall('cus:Customizations', NSMAP)
namespace2 = root22.findall('xsi:customization', NSMAP)
namespace3 = root22.findall('xt:envValueType', NSMAP)

print(namespace3)

When I run this script, I get the output below:

<Element {http://www.bea.com/wli/config/customizations}Customizations at 0x7faadb3a0508>
[]

I am able to get the root tag, but I cannot access the inner namespaced tags.

Can you please point out where I am going wrong? How do I read the data in all the inner namespaced tags?

Upvotes: 0

Views: 272

Answers (1)

har07

Reputation: 89285

That's because the target element you're trying to get is not a direct child of the root element, and findall() with a bare tag name only searches direct children. You need to either specify the full path from the root to the target element:

namespace3 = root22.findall('cus:customization/cus:envValueAssignments/xt:envValueType', NSMAP)

or use the relative descendant-or-self axis (.//) at the beginning of the XPath:

namespace3 = root22.findall('.//xt:envValueType', NSMAP)
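To see the difference between the three lookups side by side, here is a minimal sketch. It assumes a trimmed-down copy of the XML from the question, embedded as a string, and it uses the standard library's xml.etree.ElementTree so that it runs without lxml installed; lxml's findall() accepts the same paths and namespace mapping. The variable names (direct, full_path, anywhere) are only illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the customization file from the question.
XML = """\
<cus:Customizations xmlns:cus="http://www.bea.com/wli/config/customizations"
                    xmlns:xt="http://www.bea.com/wli/config/xmltypes">
  <cus:customization>
    <cus:envValueAssignments>
      <xt:envValueType>working manager</xt:envValueType>
    </cus:envValueAssignments>
  </cus:customization>
</cus:Customizations>
"""

NSMAP = {
    "cus": "http://www.bea.com/wli/config/customizations",
    "xt": "http://www.bea.com/wli/config/xmltypes",
}

root = ET.fromstring(XML)

# A bare tag name only matches direct children of the root, so this is empty.
direct = root.findall("xt:envValueType", NSMAP)

# Spelling out every step from the root reaches the element.
full_path = root.findall(
    "cus:customization/cus:envValueAssignments/xt:envValueType", NSMAP
)

# ".//" searches all descendants, whatever their depth.
anywhere = root.findall(".//xt:envValueType", NSMAP)

print(direct)
print([e.text for e in full_path])
print([e.text for e in anywhere])
```

The first lookup reproduces the empty list from the question; the other two both return the single xt:envValueType element.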

For more complex XPath expressions later on, you are better off using lxml's xpath() method, which provides better XPath support:

namespace3 = root22.xpath('.//xt:envValueType', namespaces=NSMAP)
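Once the lookup returns elements, reading the data is a matter of looping over them and taking .text or an attribute. The sketch below is one possible way to do that, not the only one. It again assumes a shortened copy of the XML embedded as a string and uses the standard library's xml.etree.ElementTree (lxml elements support the same findall() and get() calls). The names summarize and XSI_TYPE are hypothetical helpers introduced for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the customization file from the question.
XML = """\
<cus:Customizations xmlns:cus="http://www.bea.com/wli/config/customizations"
                    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                    xmlns:xt="http://www.bea.com/wli/config/xmltypes">
  <cus:customization xsi:type="cus:EnvValueCustomizationType">
    <cus:envValueAssignments>
      <xt:envValueType>working manager</xt:envValueType>
      <xt:owner>
        <xt:type>FLOW</xt:type>
        <xt:path>/somedir/dir/somepath3</xt:path>
      </xt:owner>
    </cus:envValueAssignments>
  </cus:customization>
  <cus:customization xsi:type="cus:ReferenceCustomizationType">
    <cus:refsToBeConsidered xsi:type="xt:ResourceRefType">
      <xt:type>WSDL</xt:type>
      <xt:path>/somedir/dir/somepath</xt:path>
    </cus:refsToBeConsidered>
  </cus:customization>
</cus:Customizations>
"""

NSMAP = {
    "cus": "http://www.bea.com/wli/config/customizations",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
    "xt": "http://www.bea.com/wli/config/xmltypes",
}

# Namespaced attributes are stored under their expanded "{uri}name" key.
XSI_TYPE = "{%s}type" % NSMAP["xsi"]


def summarize(root):
    """Return (xsi:type value, [xt:path texts]) for each cus:customization."""
    rows = []
    # cus:customization elements are direct children of the root.
    for cust in root.findall("cus:customization", NSMAP):
        # xt:path sits at varying depths, so search all descendants.
        paths = [p.text for p in cust.findall(".//xt:path", NSMAP)]
        rows.append((cust.get(XSI_TYPE), paths))
    return rows


rows = summarize(ET.fromstring(XML))
for kind, paths in rows:
    print(kind, paths)
```

Note that the prefix mapping passed to findall() applies to element names only; attributes such as xsi:type have to be read with the expanded {namespace-uri}name form.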

Upvotes: 1
