Reputation: 823
I have an XML file in the following format.
<root>
...
<start/>
some text <b> bold </b>
<end/>
...
<start/>
some other text <i>italic </i>
<end/>
...
</root>
Please suggest an XSL template that selects all the text between the <start/> and <end/> tags. Please note that <start/> and <end/> are empty elements.
Thank you very much.
Upvotes: 0
Views: 653
Reputation: 28004
I'm interpreting your "all the text" to include not only text nodes themselves but also markup such as <b>. I'm also assuming that you do not want to select every descendant node between <start/> and <end/>, but only the top-level ones. Further, I'm assuming that all the <start/> and <end/> tags are siblings (they cannot occur at just any level).
Use the following template to select (and copy) everything that is between the <start/> and <end/> tags.
<xsl:template match="/">
  <xsl:copy-of select="(//start)[1]/following-sibling::node()[not(self::end) and
      name((preceding-sibling::start | preceding-sibling::end)[last()]) = 'start']"/>
</xsl:template>
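For completeness, here is a minimal sketch of that template wrapped in a full XSLT 1.0 stylesheet, so it can be run as-is. The `<selected>` wrapper element is my own addition (hypothetical; it is not in your input) and is only there so the result has a single root element. It still assumes every <start/> and <end/> is a sibling of the first <start/>.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <!-- Hypothetical wrapper so the output is well-formed XML -->
    <selected>
      <!-- Take every following sibling of the first <start/> that is not an
           <end/> and whose nearest preceding milestone sibling is a <start/>.
           The union is in document order, so [last()] is the nearest one. -->
      <xsl:copy-of select="(//start)[1]/following-sibling::node()[not(self::end) and
          name((preceding-sibling::start | preceding-sibling::end)[last()]) = 'start']"/>
    </selected>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the sample in the question, this should copy `some text <b> bold </b>` and `some other text <i>italic </i>` (plus the surrounding whitespace text nodes) into `<selected>`, and skip whatever sits outside the start/end pairs.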
Update:
Given that your start/end can be at any level, you can remove the -sibling from the axes above:
select="(//start)[1]/following::node()[not(self::end) and
name((preceding::start | preceding::end)[last()]) = 'start']"
However, this selects all the nodes at every depth (descendants included), not just the top-level ones. (And therefore if you deep-copy the selected nodes, you will get duplicates.) This is because it is not well-defined what should happen if you had something like this:
<start/>
<chapter>foo<end/></chapter>
Should <chapter> be selected or not?
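If it turns out that you only need the character data and not the markup, one way to sidestep both the duplicates and the <chapter> question is to select text nodes only. The following is a sketch under that assumption, not a general answer: it outputs plain text, so elements such as <b> are dropped and only their text content survives.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: only character data is wanted, so emit plain text -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Text nodes have no descendants, so each one is output exactly once.
         A text node qualifies when the nearest milestone before it, at any
         level of the tree, is a <start/>. -->
    <xsl:for-each select="(//start)[1]/following::text()[
        name((preceding::start | preceding::end)[last()]) = 'start']">
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With the <chapter> example above, this would output `foo` and stop at the <end/>. Because <start/> and <end/> are empty, they are never ancestors of a text node, so the `preceding` axis sees all of them.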
However, if you can put further constraints on where start/end can fall in relation to each other, we can do better. E.g., is every <end/> a sibling of the preceding <start/>? If so, you could do
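<!-- Index each element/text node under its nearest preceding milestone sibling.
     <end/> is excluded so that the closing milestone itself is not copied. -->
<xsl:key name="text-by-last-milestone" match="*[not(self::end)] | text()"
  use="generate-id((preceding-sibling::start | preceding-sibling::end)[last()])" />
<xsl:template match="/">
  <xsl:for-each select="//start">
    <xsl:copy-of select="key('text-by-last-milestone', generate-id())"/>
  </xsl:for-each>
</xsl:template>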
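A side benefit of the key-based version is that each <start/> is visited separately, so you can keep the ranges apart in the output. Here is a sketch of a complete stylesheet that does this. The `<regions>` and `<region>` element names are hypothetical and chosen only for illustration. The key's match pattern excludes <end/> so that the closing milestone is not copied along with the content.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Index elements (except <end/>) and text nodes by the generated id
       of their nearest preceding <start/> or <end/> sibling. -->
  <xsl:key name="text-by-last-milestone" match="*[not(self::end)] | text()"
      use="generate-id((preceding-sibling::start | preceding-sibling::end)[last()])"/>

  <xsl:template match="/">
    <!-- Hypothetical wrapper elements, one <region> per <start/> -->
    <regions>
      <xsl:for-each select="//start">
        <region>
          <!-- generate-id() with no argument refers to the current <start/> -->
          <xsl:copy-of select="key('text-by-last-milestone', generate-id())"/>
        </region>
      </xsl:for-each>
    </regions>
  </xsl:template>

</xsl:stylesheet>
```

For the sample in the question this should give two <region> elements, one holding `some text <b> bold </b>` and one holding `some other text <i>italic </i>`, each with its surrounding whitespace.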
If not, it would be helpful for you to show a more extended sample of your input.
FYI, these tags are referred to as "milestone" markup, so you may be able to find more information about processing them by searching on that term. Depending on what the constraints on your input XML are, they are also discussed as "concurrent markup".
Upvotes: 3