Alex Wally

Reputation: 2339

Remove certain text between XML tags

I need some help to transform this XML document:

<root>
<tree>
<leaf>Hello</leaf>
ignore me
<pear>World</pear>
</tree>
</root>

to this:

<root>
<tree>
<leaf>Hello</leaf>
<pear>World</pear>
</tree>
</root>

The example is simplified, but basically I could either remove every occurrence of "ignore me" or remove all text that is not inside a leaf or pear element.

So far I've only come up with this XSLT, which copies pretty much everything:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" encoding="UTF-8" standalone="yes"/>

    <xsl:template match="root|tree">
        <xsl:element name="{name()}">
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates/>
        </xsl:element>
    </xsl:template>

    <xsl:template match="leaf|pear">
        <xsl:element name="{name()}">
            <xsl:copy-of select="child::node()"/>
        </xsl:element>
    </xsl:template>

</xsl:stylesheet>

I did find out how to use xsl:call-template to remove text inside a leaf or pear element, but that didn't work for text directly inside a tree element.

Thanks in advance.

Upvotes: 3

Views: 5265

Answers (2)

Daniel Haley

Reputation: 52848

Here's another option that will remove text from any element with mixed content (both elements and text)...

XML Input

<root>
    <tree>
        <leaf>Hello</leaf>
        ignore me
        <pear>World</pear>
    </tree>
</root>

XSLT 1.0

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

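    <!-- Identity transform: copy every attribute and node unchanged. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- An element with both element children and text: copy it, but process
         only its attributes and child elements, which drops the text. -->
    <xsl:template match="*[* and text()]">
        <xsl:copy>
            <xsl:apply-templates select="@*|*"/>
        </xsl:copy>
    </xsl:template>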

</xsl:stylesheet>

XML Output

<root>
   <tree>
      <leaf>Hello</leaf>
      <pear>World</pear>
   </tree>
</root>

Also, if the text really is only "ignore me", you could do this:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Drop text nodes whose whitespace-normalized value is exactly "ignore me". -->
    <xsl:template match="text()[normalize-space(.)='ignore me']"/>

</xsl:stylesheet>
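
The question also mentions a second option: dropping everything that's not inside a leaf or a pear. A minimal sketch of that, assuming "inside" means anywhere below a leaf or pear element (so the ancestor axis is used rather than parent), could look like this. It reuses the same identity template and only changes the empty template:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Identity transform: copy every attribute and node unchanged. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Assumption: text counts as "inside" leaf/pear at any depth.
         Drop every text node that has no leaf or pear ancestor. -->
    <xsl:template match="text()[not(ancestor::leaf or ancestor::pear)]"/>

</xsl:stylesheet>
```

The empty template wins over the identity template for those text nodes because a pattern with a predicate has a higher default priority than node(). If only direct children of leaf and pear should keep their text, swapping ancestor:: for parent:: should do it.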

Upvotes: 2

hr_117

Reputation: 9627

It looks like an identity transform is what you are looking for. Because text that is a direct child of root or tree should be ignored, add empty templates for those text nodes. Try:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >

    <xsl:output indent="yes" method="xml" encoding="utf-8" omit-xml-declaration="yes" />

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    <!-- Empty templates: drop text nodes that are direct children of tree or root. -->
    <xsl:template match="tree/text()" />
    <xsl:template match="root/text()" />

</xsl:stylesheet>

Which will generate the following output:

<root>
  <tree>
    <leaf>Hello</leaf>
    <pear>World</pear>
  </tree>
</root>

Upvotes: 4
