John

Reputation: 135

Remove Namespace and Extract a subset of XML file using XSL

My input XML is:

<country>
    <state>
        <city>
            <name>DELHI</name>
        </city>
    </state>
</country>

The required output is:

<city>
  <name>DELHI</name>            
</city>

The following XSL works fine:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes" omit-xml-declaration="yes" />
    <xsl:template match="/">
        <xsl:copy-of select="//city">
        </xsl:copy-of>
    </xsl:template>
</xsl:stylesheet>

But the same XSL does not work for the above input XML if a namespace is added, like below:

<country xmlns="http://india.com/states" version="1.0">
   <state>
       <city>
           <name>DELHI</name>            
       </city>
  </state>
</country>

I want the namespace to be removed and the city element to be copied.

Any help would be appreciated. Thanks

Upvotes: 4

Views: 1927

Answers (2)

Dimitre Novatchev

Reputation: 243599

This is the most frequently asked question about XPath, XML and XSLT. Search for "default namespace and XPath expressions".

As for a solution:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:x="http://india.com/states">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="*">
  <xsl:element name="{name()}">
   <xsl:copy-of select="@*"/>
   <xsl:apply-templates/>
  </xsl:element>
 </xsl:template>


 <xsl:template match="*[not(ancestor-or-self::x:city)]">
  <xsl:apply-templates/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<country xmlns="http://india.com/states" version="1.0">
    <state>
        <city>
            <name>DELHI</name>
        </city>
    </state>
</country>

the wanted result is produced:

<city>
   <name>DELHI</name>
</city>

Explanation:

  1. In XPath, an unprefixed element name is always considered to be in "no namespace". However, every element name in the provided XML document is in a non-empty namespace (the default namespace "http://india.com/states"). Therefore, //city selects no node (as there is no element in the XML document that is in no namespace), while //x:city, where x: is bound to the namespace "http://india.com/states", selects all city elements (that are in the namespace "http://india.com/states").

  2. In this transformation there are two templates. The first template matches any element and re-creates it, but in no namespace. It also copies all attributes and then applies templates to the child nodes of this element.

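  3. The second template overrides the first for all elements that are neither a city element nor a descendant of one (its pattern carries a predicate, so its default priority is higher than that of the bare *). The action here is to apply templates to all child nodes, without producing any output for the element itself.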
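
To see why point 1 alone is not enough, here is a minimal sketch: the OP's original stylesheet with only the prefix binding added. The prefix x is an arbitrary choice; only the namespace URI it is bound to matters.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:x="http://india.com/states">
    <xsl:output indent="yes" omit-xml-declaration="yes"/>

    <!-- x: is bound to the source document's default namespace,
         so //x:city selects the city element that //city missed -->
    <xsl:template match="/">
        <xsl:copy-of select="//x:city"/>
    </xsl:template>
</xsl:stylesheet>
```

This does select the city element, but xsl:copy-of copies nodes together with their namespace, so the copied city should still carry xmlns="http://india.com/states". Getting rid of the namespace is what requires re-creating each element with xsl:element, as the first template does.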

UPDATE: The OP has modified the question, asking why there is unwanted text in the result of processing a new, modified XML document:

<country xmlns="http://india.com/states" version="1.0">
        <state>
            <city>
                <name>DELHI</name>
            </city>
        </state>
        <state2>
            <city2>
                <name2>MUMBAI</name2>
            </city2>
        </state2>
</country>

In order not to produce the text "MUMBAI", the transformation above needs to be slightly modified -- to ignore (not copy) any text node that has no x:city ancestor. Without this, the built-in template for text nodes copies every text node that is reached to the output. For this purpose, we add the following one-line, empty template:

 <xsl:template match="text()[not(ancestor::x:city)]"/>

The whole transformation now becomes:

<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:x="http://india.com/states">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>
     <xsl:strip-space elements="*"/>

     <xsl:template match="*">
      <xsl:element name="{name()}">
       <xsl:copy-of select="@*"/>
       <xsl:apply-templates/>
      </xsl:element>
     </xsl:template>

     <xsl:template match="*[not(ancestor-or-self::x:city)]">
      <xsl:apply-templates/>
     </xsl:template>

     <xsl:template match="text()[not(ancestor::x:city)]"/>
</xsl:stylesheet>

and the result is still the wanted, correct one:

<city>
   <name>DELHI</name>
</city>
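
One possible hardening, offered as a sketch rather than as part of the answer above: name() returns the qualified name, which works here because the source uses an unprefixed default namespace. If the source elements carried a prefix instead (a hypothetical variant such as s:city), name() would return "s:city", and xsl:element would have to resolve a prefix that the stylesheet does not declare. Using local-name() avoids that dependency, assuming you only ever want the local part of each name:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:x="http://india.com/states">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Re-create each element in no namespace, using only the
         local part of its name (any source prefix is dropped) -->
    <xsl:template match="*">
        <xsl:element name="{local-name()}">
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates/>
        </xsl:element>
    </xsl:template>

    <!-- Elements outside any x:city: emit nothing, just descend -->
    <xsl:template match="*[not(ancestor-or-self::x:city)]">
        <xsl:apply-templates/>
    </xsl:template>

    <!-- Text outside any x:city: suppress -->
    <xsl:template match="text()[not(ancestor::x:city)]"/>
</xsl:stylesheet>
```

For the XML document in this question, the two versions should behave the same, since none of its element names has a prefix.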

Upvotes: 3

Emiliano Poggi

Reputation: 24846

You can get the wanted output by using a template like:

 <xsl:template match="*[not(ancestor-or-self::x:*[starts-with(name(),'city')])]">
  <xsl:apply-templates/>
 </xsl:template>

or

 <xsl:template match="/">
     <xsl:apply-templates select="//x:*[starts-with(name(),'city')]"/>
 </xsl:template>

Tested with Microsoft (R) XSLT Processor Version 4.0 on your new input, it gives:

<city>
   <name>DELHI</name>
</city>
<city2>
   <name2>MUMBAI</name2>
</city2>

Upvotes: 0
