user1244932

Reputation: 8082

XSLT: copying only one element, but all the text content ends up in the output?

I have the following DocBook document:

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
<article lang="en">
     <articleinfo>
            <title>Test Article</title>
     </articleinfo>
     <sect1 label="">
            <title>Test1 Section</title>
            <para>Some content</para>
    </sect1>
    <sect1>
            <title>Test2 Section</title>
            <para>Another content</para>
    </sect1>
</article>

I want to extract only the sect1 whose title contains "Test1", so I wrote this simple XSLT transformation:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:d="http://docbook.org/ns/docbook">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="article/sect1[contains(title, 'Test1')]">
  <xsl:message>Match</xsl:message>
  <xsl:copy-of select="."/>
</xsl:template>
</xsl:stylesheet>

But for some reason this transformation does not copy only the sect1 with that title. It also outputs the text content of the rest of the XML file, minus the tags:

xsltproc ./extract_sect1.xsl test.xml 
Match
<?xml version="1.0"?>


            Test Article

    <sect1 label="">
            <title>Test1 Section</title>
            <para>Some content</para>
    </sect1>

            Test2 Section
            Another content

So my question is: why do I get "Test Article", "Test2 Section" and "Another content" in the output, and how can I fix this?

Upvotes: 1

Views: 55

Answers (2)

Peter

Reputation: 1796

Please add this template to your XSLT:

<xsl:template match="text()"/>

and your output will look like this:

<?xml version="1.0" encoding="UTF-8"?>
<sect1 label="">
    <title>Test1 Section</title>
    <para>Some content</para>
</sect1>
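
For reference, here is a sketch of what the complete stylesheet could look like with that template added. It assumes the input from the question; the unused `xmlns:d` declaration and the `xsl:message` are left out for brevity, which is my simplification and not something the question requires.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Copy the wanted section verbatim, including its children. -->
  <xsl:template match="article/sect1[contains(title, 'Test1')]">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- Override the built-in text rule so stray text nodes produce no output. -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

The text inside the copied sect1 is not affected, because `xsl:copy-of` copies the subtree directly and never applies templates to its text nodes.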

Upvotes: 2

Ian Roberts

Reputation: 122364

XSLT has built-in default template rules for all node types, which apply whenever no explicit template matches a node. Processing always starts by applying templates to the document root node, and the default rule for elements is just

<xsl:apply-templates/>

(i.e. continue applying templates to child nodes recursively). Your problem is that the default template rule for text nodes is to output the text to the result tree. So for every element other than the one matched by article/sect1[contains(title, 'Test1')], all descendant text nodes are written to the result.
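
To make that concrete, the built-in rules of XSLT 1.0 behave roughly as if the following templates were present in every stylesheet. This is only a sketch of the defaults described in the XSLT 1.0 specification; you would not normally write them out yourself.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Root node and elements: keep descending into the children. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: write their string value to the result. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: produce nothing. -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

Run against the document in the question, a stylesheet like this outputs every text node and no tags, which is exactly the stray text you see around the copied sect1.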

There are two obvious ways around this: you either define

<xsl:template match="text()" />

to suppress the default text node behaviour, or you add

<xsl:template match="/">
  <xsl:apply-templates select="article/sect1[contains(title, 'Test1')]" />
</xsl:template>

to jump straight to the section you're interested in and not apply the default templates to the other elements at all.
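
As a sketch of that second approach, the whole stylesheet could look like the following, assuming the input from the question. The unused `xmlns:d` declaration and the `xsl:message` are omitted here for brevity; that is my simplification.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Start at the root and select only the wanted section,
       so the built-in rules never visit the other elements. -->
  <xsl:template match="/">
    <xsl:apply-templates select="article/sect1[contains(title, 'Test1')]"/>
  </xsl:template>

  <!-- Copy the selected section verbatim. -->
  <xsl:template match="article/sect1[contains(title, 'Test1')]">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

With this version no `text()` override is needed, since templates are never applied to articleinfo or to the second sect1.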

Upvotes: 1
