janlindso

Reputation: 1239

Making an index of a book with XSL

I have an XML document with the following structure, and I want to build an index of some specific words:

<book>
<chapter title="This is first chapter">
        <section title="This is the first section">
        <paragraph title="This is the first paragraph">This is the paragraph content, where this <index>word</index> should be in the index</paragraph>
        </section>
</chapter>
<chapter title="This is second chapter">
        <section title="This is the first section">
        <paragraph title="This is the first paragraph">This is the paragraph content</paragraph>
        </section>
</chapter>
</book>

So I want to make a list of all <index> elements. Here is what I tried:

<xsl:template match="/">
    <xsl:apply-templates select="book/chapter" />
</xsl:template>

<xsl:template match="chapter">
    <html>
        <head>
            <title>Index</title>
        </head>
        <body>
            <h1>
                Index
            </h1>
            <xsl:apply-templates />
        </body>
    </html>
</xsl:template>

<xsl:template match="index">
    <p>
        <xsl:value-of select="."/>
    </p>
</xsl:template>

The index words are printed correctly, but all the other text from the XML is printed as well, in a mess (all text nodes are output one after another). I only want the index elements and nothing else.

Upvotes: 1

Views: 255

Answers (4)

michael.hor257k

Reputation: 116992

Here is a rough attempt to produce a real index - i.e. grouped, sorted and including a list of locations where each entry is found (in the form of chapter#.paragraph#).

Some assumptions are being made here:

  • Your processor supports the EXSLT set:distinct() function (thus avoiding the need for Muenchian grouping);
  • All index entries appear within a paragraph element (although not necessarily as a direct child); each index entry appears only once in the same paragraph;
  • All paragraphs are children of a section; all sections are children of a chapter.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:set="http://exslt.org/sets"
extension-element-prefixes = "set">
<xsl:output method="html" version="1.0" encoding="utf-8" indent="yes"/>

<xsl:key name="index" match="index" use="." />

<xsl:template match="/">
<html>
<head>
    <title>Index</title>
</head>
<body>
    <h1>Index</h1>

    <xsl:for-each select="set:distinct(book/chapter/section/paragraph//index)">
    <xsl:sort select="." data-type="text" order="ascending"/>
    <p>
        <xsl:value-of select="."/>
        <xsl:text> - </xsl:text>
        <xsl:for-each select="key('index', .)">
            <xsl:value-of select="count(ancestor::chapter/preceding-sibling::chapter) + 1"/>
            <xsl:text>.</xsl:text>
            <xsl:value-of select="count(ancestor::paragraph/preceding-sibling::paragraph) + count(ancestor::section/preceding-sibling::section/paragraph) + 1"/>
            <xsl:if test="position() != last()">
                <xsl:text>, </xsl:text>
            </xsl:if>   
        </xsl:for-each>
    </p>
    </xsl:for-each>  
</body>
</html>
</xsl:template>
</xsl:stylesheet>
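
If your processor does not support EXSLT, the same grouping could be done with Muenchian grouping, reusing the same key. The following is only a sketch under the same structural assumptions as above; the only real change is the select expression of the outer xsl:for-each, which keeps the first index element for each distinct string value:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" encoding="utf-8" indent="yes"/>

<!-- Look up index elements by their string value -->
<xsl:key name="index" match="index" use="." />

<xsl:template match="/">
<html>
<head>
    <title>Index</title>
</head>
<body>
    <h1>Index</h1>

    <!-- Muenchian grouping: an index element is selected only if it is
         the first node returned by the key for its own value -->
    <xsl:for-each select="book/chapter/section/paragraph//index[generate-id() = generate-id(key('index', .)[1])]">
    <xsl:sort select="." data-type="text" order="ascending"/>
    <p>
        <xsl:value-of select="."/>
        <xsl:text> - </xsl:text>
        <!-- List every location of this entry as chapter#.paragraph# -->
        <xsl:for-each select="key('index', .)">
            <xsl:value-of select="count(ancestor::chapter/preceding-sibling::chapter) + 1"/>
            <xsl:text>.</xsl:text>
            <xsl:value-of select="count(ancestor::paragraph/preceding-sibling::paragraph) + count(ancestor::section/preceding-sibling::section/paragraph) + 1"/>
            <xsl:if test="position() != last()">
                <xsl:text>, </xsl:text>
            </xsl:if>
        </xsl:for-each>
    </p>
    </xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
```

This version uses only XSLT 1.0 constructs (xsl:key, key() and generate-id()), so it should run on processors that lack the EXSLT sets module.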

Upvotes: 1

Zach Young

Reputation: 11188

With the appropriate select on apply-templates, only the index elements are processed. I also added the chapter title to the index heading.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:apply-templates select="book/chapter" />
  </xsl:template>
  <xsl:template match="chapter">
    <html>
      <head>
        <title>Index</title>
      </head>
      <body>
        <h1><xsl:value-of select="@title"/> Index</h1>
        <xsl:apply-templates select=".//index"  />
      </body>
    </html>
  </xsl:template>
  <xsl:template match="index">
    <p> 
      <xsl:value-of select="."/>
    </p>
  </xsl:template>
</xsl:stylesheet>

Upvotes: 0

michael.hor257k

Reputation: 116992

Actually, I think you need fewer templates, not more:

<xsl:template match="/book">
<html>
<head>
    <title>Index</title>
</head>
<body>
    <h1>Index</h1>
    <xsl:apply-templates select="descendant::index"/>
</body>
</html>
</xsl:template>

<xsl:template match="index">
    <p><xsl:value-of select="."/></p>
</xsl:template>

Or, if you prefer the entries sorted alphabetically:

<xsl:template match="/book">
<html>
<head>
    <title>Index</title>
</head>
<body>
    <h1>Index</h1>
    <xsl:apply-templates select="descendant::index">
        <xsl:sort select="." data-type="text" order="ascending"/>
    </xsl:apply-templates>
</body>
</html>
</xsl:template>

<xsl:template match="index">
    <p><xsl:value-of select="."/></p>
</xsl:template>

Upvotes: 2

Daniel Haley

Reputation: 52858

The reason you're getting all of the text nodes is XSLT's built-in template rules.

Try adding the template:

<xsl:template match="text()"/>
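
To show where that template fits, here is a minimal sketch of a complete stylesheet. The comment reproduces the built-in rules roughly as the XSLT 1.0 specification describes them. Moving the HTML wrapper into the root template is my own change, made so that a book with several chapters produces a single html element; it is not part of the original attempt:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" encoding="utf-8" indent="yes"/>

<!-- Built-in rules that apply when no template of yours matches a node:

       <xsl:template match="*|/">
           <xsl:apply-templates/>
       </xsl:template>

       <xsl:template match="text()|@*">
           <xsl:value-of select="."/>
       </xsl:template>

     The first one recurses into chapter, section and paragraph;
     the second one copies their text to the output. -->

<xsl:template match="/">
    <html>
        <head>
            <title>Index</title>
        </head>
        <body>
            <h1>Index</h1>
            <!-- Elements without a template still fall through to the
                 built-in element rule, which recurses down to index -->
            <xsl:apply-templates/>
        </body>
    </html>
</xsl:template>

<!-- Override the built-in text rule: text nodes reached through
     apply-templates now produce no output -->
<xsl:template match="text()"/>

<!-- value-of reads the string value of index directly, so it is
     not affected by the empty text() template above -->
<xsl:template match="index">
    <p><xsl:value-of select="."/></p>
</xsl:template>
</xsl:stylesheet>
```

With the sample document from the question, the body should then contain only the heading and one paragraph holding "word".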

Upvotes: 3
