Willi Fischer

Reputation: 455

Remove all nodes from an XML document except for some

I have a simple XML file like this:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <garbage1>something</garbage1>
    <garbage2>something</garbage2>
    <garbage3>something</garbage3>
    <item>
        <a>
            <b/>
            <c>123</c>
        </a>
        <d>456</d>
    </item>
    <item>
        <a>
            <b/>
            <c>789</c>
        </a>
        <d>666</d>
    </item>
</root>

I want to remove all nodes except <c> and <d> inside the <item> nodes to get a result like this:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <item>
        <c>123</c>
        <d>456</d>
    </item>
    <item>
        <c>789</c>
        <d>666</d>
    </item>
</root>

Apparently the right way to do this is to use the identity transformation and then override it accordingly. If I just wanted to remove <c> and <d>, this would do the job:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!--    identity transformation-->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!--    override identity transformation-->
    <xsl:template match="c|d"/>

</xsl:stylesheet>

OK, so I just need to negate the argument to get rid of all nodes that are NOT <c> or <d>. Why does this not work?

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!--    identity transformation-->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!--    override identity transformation to (apparently not) get rid of all nodes except 'c' and 'd'-->
    <xsl:template match="//*[not(local-name() = ('c', 'd'))]"/>

</xsl:stylesheet>

Thank you all very much, I feel like I'm missing something simple here...

Upvotes: 0

Views: 1281

Answers (2)

michael.hor257k

Reputation: 117102

First, the correct way to accomplish your goal of:

to remove all nodes except <c> and <d> inside the <item> nodes

is:

XSLT 1.0/2.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="*[ancestor::item][not(self::c or self::d)]">
    <xsl:apply-templates/>
</xsl:template>

</xsl:stylesheet>

This matches any element that is "inside the <item>" (i.e. a descendant of item) except c and d, and applies templates to its children without copying the element itself. Thus, for example, the a wrapper is removed, but its c child is still processed by the identity transform template.
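
Note that this template only touches elements inside item, so the garbage1 to garbage3 siblings are still copied by the identity transform. If they should disappear as well, as the expected result in the question suggests, one more override could be added. This is a minimal sketch, assuming every child of root other than item is unwanted; that last template is an addition for illustration, not part of the stylesheet above:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- identity transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- unwrap everything inside item except c and d (same as above) -->
    <xsl:template match="*[ancestor::item][not(self::c or self::d)]">
        <xsl:apply-templates/>
    </xsl:template>

    <!-- assumption: children of root other than item are unwanted;
         an empty template drops them together with their content -->
    <xsl:template match="root/*[not(self::item)]"/>

</xsl:stylesheet>
```

An empty template is safe here because nothing below the garbage elements needs to survive. For the sample input this should leave only root, the two item elements and their c and d children.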


Your attempt did not work because your second template also matches the root element: root is neither c nor d, and the predicate pattern has a higher default priority than the identity template. The template outputs nothing and applies no further templates, so processing ends at that point with an empty result.
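
The negated predicate from the question can be made to work if the override unwraps elements instead of suppressing them, and if the elements that must stay (root and item) are added to the keep list. This is only a sketch under those assumptions. It needs an XSLT 2.0 processor because of the sequence comparison, and it matches by name alone, so a c or d outside item would be kept as well:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- identity transformation -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Elements not on the keep list are not copied, but their child
         elements are still visited, so a nested c or d survives.
         select="*" skips text nodes, so the text inside the garbage
         elements is dropped along with them. -->
    <xsl:template match="*[not(local-name() = ('root', 'item', 'c', 'd'))]">
        <xsl:apply-templates select="*"/>
    </xsl:template>

</xsl:stylesheet>
```

The key difference from the original attempt is that no template ever stops the traversal: root and item fall through to the identity template, and everything else passes processing on to its child elements.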

Upvotes: 1

Willi Fischer

Reputation: 455

I found a way to solve this with a different approach:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="@*|node()">
        <xsl:apply-templates select="@*|node()"/>
    </xsl:template>

    <xsl:template match="root">
        <root>
            <xsl:apply-templates select="@*|node()"/>
        </root>
    </xsl:template>

    <xsl:template match="item">
        <item>
            <xsl:apply-templates select="@*|node()"/>
        </item>
    </xsl:template>

    <xsl:template match="c | d">
        <xsl:copy-of select="."/>
    </xsl:template>


</xsl:stylesheet>

However, it still bugs me what is wrong with my original idea, so I'd be very happy if someone could help me there.

Upvotes: 0
