Oleg

Reputation: 303

XSLT matching for text between XML processing instructions

Given the following XML:

<items>
<item id="item1">
    <description id="desc">
        <?RELAPP description="Relative" loc="start"?>
        <heading id="h1" level="1">HEADING 1</heading>
        <p id="p2" num="1">Paragraph A</p>
        <?RELAPP description="Relative" loc="end"?>
        <?SUMM description="Summary" loc="start"?>
        <heading id="h2" level="1">HEADING 2</heading>
        <p id="p3" num="2">Paragraph B</p>
        <p id="p4" num="3">Paragraph C</p>
        <heading id="h3" level="1">HEADING 3</heading>
        <p id="p5" num="4">Paragraph D</p>
        <p id="p6" num="5">Paragraph E</p>
        <?SUMM description="Summary" loc="end"?>
        <?drawings description="Drawings" loc="start"?>
        <drawings>
            <heading id="h4" level="1">HEADING 4</heading>
            <p id="p7" num="6">Paragraph F</p>
            <p id="p8" num="7">Paragraph G</p>          
        </drawings>
        <?drawings description="Drawings" loc="end"?>
    </description>
</item> 
</items>

I'm trying to get to the text between:

<?SUMM description="Summary" loc="start"?>

and

<?SUMM description="Summary" loc="end"?>

That is:

HEADING 2 Paragraph B Paragraph C HEADING 3 Paragraph D Paragraph E

ideally with some separator between the headings and paragraphs.

The best XSLT I've been able to come up with is:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/> 
<xsl:template match="/items">
    <myItems>
        <xsl:apply-templates/>
    </myItems>
</xsl:template> 

<xsl:template match="item">
    <xsl:element name="info">
        <xsl:element name="summaryPI">          
            <xsl:for-each select="description/processing-instruction('SUMM')">
                <xsl:value-of select="."/>
            </xsl:for-each>         
        </xsl:element>
    </xsl:element>
</xsl:template>
</xsl:stylesheet>

but it only gets me this:

<?xml version="1.0" encoding="UTF-8"?>
 <myItems>
  <info>
   <summaryPI>description="Summary" loc="start"description="Summary" loc="end"</summaryPI>
  </info>
</myItems>

What rule should I use to get the text I want? I tried the preceding-sibling and following-sibling axes, but I couldn't get them to work. I'm using XSLT 1.0.

Upvotes: 0

Views: 559

Answers (1)

michael.hor257k

Reputation: 116959

How about:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/">
    <xsl:for-each select="//text()[preceding::processing-instruction('SUMM')[contains(., 'loc=&quot;start&quot;')]]
                                  [following::processing-instruction('SUMM')[contains(., 'loc=&quot;end&quot;')]] ">
        <xsl:value-of select="." />
        <xsl:if test="position()!=last()">
            <xsl:text>, </xsl:text>
        </xsl:if>   
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

Applied to your input example, the result will be:

HEADING 2, Paragraph B, Paragraph C, HEADING 3, Paragraph D, Paragraph E
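
One caveat, in case the real input has more than one item: the preceding and following axes look across the whole document, so a text node that comes after the SUMM end marker of one item can still have a SUMM end marker following it in a later item, and it would then be selected as well. A minimal sketch of one way around that, assuming the start and end markers are always properly paired and never nested, is to test only the nearest preceding SUMM instruction and to build the output per item, in the XML shape used in the question. The summary element name is made up for illustration; it is not something the question asked for.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/items">
    <myItems>
        <xsl:apply-templates select="item"/>
    </myItems>
</xsl:template>

<xsl:template match="item">
    <info>
        <!-- "summary" is a hypothetical element name chosen for this sketch -->
        <summary>
            <!-- Keep only the text nodes of this item whose nearest preceding
                 SUMM processing instruction is the "start" marker. On a reverse
                 axis such as preceding, [1] picks the closest node. -->
            <xsl:for-each select=".//text()[preceding::processing-instruction('SUMM')[1][contains(., 'loc=&quot;start&quot;')]]">
                <xsl:value-of select="."/>
                <xsl:if test="position() != last()">
                    <xsl:text>, </xsl:text>
                </xsl:if>
            </xsl:for-each>
        </summary>
    </info>
</xsl:template>

</xsl:stylesheet>
```

For the single-item example in the question, this should select the same six text nodes as the expression above, just wrapped in myItems/info/summary instead of being written as plain text.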

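Note: if it can be assumed that all the nodes between the two processing instructions are siblings of those instructions (as they are in your example), then the selection in the first stylesheet could be made a little more efficient by looking only at sibling elements instead of at every text node in the document: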

<xsl:for-each select="//*[preceding-sibling::processing-instruction('SUMM')[contains(., 'loc=&quot;start&quot;')]]
                         [following-sibling::processing-instruction('SUMM')[contains(., 'loc=&quot;end&quot;')]] ">
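
For completeness, here is how that fragment might look dropped into a full stylesheet. This is only a sketch under the same assumption as the note (everything between the two markers is a sibling element of the markers); the rest of the stylesheet is unchanged from the first version. Because the expression now selects elements rather than text nodes, xsl:value-of outputs each element's string value, which is all of its text content concatenated.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/">
    <!-- Select every element that has a SUMM "start" marker among its
         preceding siblings and a SUMM "end" marker among its following
         siblings. -->
    <xsl:for-each select="//*[preceding-sibling::processing-instruction('SUMM')[contains(., 'loc=&quot;start&quot;')]]
                             [following-sibling::processing-instruction('SUMM')[contains(., 'loc=&quot;end&quot;')]]">
        <!-- String value of the element, i.e. its text content -->
        <xsl:value-of select="."/>
        <xsl:if test="position() != last()">
            <xsl:text>, </xsl:text>
        </xsl:if>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

On your input example this should select the two headings and four paragraphs between the SUMM markers, and nothing from the RELAPP or drawings sections, since those elements do not sit between the two SUMM instructions among their siblings.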

Upvotes: 1
