Hobbes
Reputation: 2115

Processing text that sits between two nodes

I have an XML document that contains some text:

<p>Sentence blah blah blah <a_href=... />,<a_href=... />,<a_href=... />.</p>

The a_href tags will be rendered as superscripts (by CSS that sets superscript styling for the a_href tag), and I want the commas between the a_href elements to be superscripted as well. So I'm looking for a transformation that produces this result:

<p>Sentence blah blah blah <a_href... /><sup>,</sup><a_href... /><sup>,</sup><a_href... />.</p>

I don't think I can use XPath to select only part of a text node, so there seems to be no way to find "an a_href tag followed by a comma and another a_href tag". I can check whether an a_href tag is followed by another a_href tag, but how can I check what's between them? The superscript should be applied only if there is a comma, or a comma and one space, between the a_href nodes. If there's any more text, it should not get a superscript.

(Edit: renamed the tag to a_href to remove ambiguity; in the actual code the underscore is absent.)

Upvotes: 0

Views: 165

Answers (1)

michael.hor257k
Reputation: 116959

"The superscript should happen only if there is a comma, or a comma and one space between the a_href nodes."

Given a well-formed (!) input such as:

XML

<p>Start <a href="abc"/>,<a href="def"/>, middle <a href="ghi"/>, <a href="jkl"/> and end.</p>

the following stylesheet:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

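<!-- wrap a text node that is just a comma (ignoring surrounding whitespace) in sup, but only when the nearest sibling elements on both sides are a -->
<xsl:template match="text()[normalize-space(.)=',' and preceding-sibling::*[1][self::a] and following-sibling::*[1][self::a]]">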
    <sup>
        <xsl:value-of select="."/>
    </sup>
</xsl:template>

</xsl:stylesheet>

will return:

<?xml version="1.0" encoding="UTF-8"?>
<p>Start <a href="abc"/>
   <sup>,</sup>
   <a href="def"/>, middle <a href="ghi"/>
   <sup>, </sup>
   <a href="jkl"/> and end.</p>
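
One caveat: normalize-space(.)=',' also accepts a comma surrounded by any amount of whitespace (several spaces, a line break), and *[1] looks past comments or processing instructions when it picks the neighbouring element. If the rule is meant literally (exactly "," or ", " and nothing else between the two elements), a stricter variant could look like the sketch below. This is only one reading of the requirement, not something stated in the question beyond the sentence quoted above. It uses the same element name a as the example input, and it leaves out indent="yes" on the assumption that you don't want the serializer to add line breaks inside the mixed content of p.

```xslt
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8"/>

<!-- identity transform: copy everything that no other template matches -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<!-- Stricter variant (assumption): the text node must be exactly "," or ", ",
     and the immediate sibling nodes on both sides must be a elements. -->
<xsl:template match="text()[(.=',' or .=', ') and preceding-sibling::node()[1][self::a] and following-sibling::node()[1][self::a]]">
    <sup>
        <xsl:value-of select="."/>
    </sup>
</xsl:template>

</xsl:stylesheet>
```

Applied to the example input above, this should wrap the same two text nodes ("," and ", ") in sup, while a text node such as a comma followed by two spaces would be copied unchanged.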

Note:

You say that the a tags are styled as superscript by CSS; I don't know much about CSS, but I suspect it could handle this task just as well.

Upvotes: 3
