sanjay

Reputation: 1020

XSLT - replace specific content of the text() node with a new node

I have XML like this:

<doc>
    <p>Biological<sub>89</sub> bases<sub>4456</sub> for<sub>8910</sub> sexual<sub>4456</sub>
            differences<sub>8910</sub> in<sub>4456</sub> the brain exist in a wide range of
        vertebrate species, including chickens<sub>8910</sub> Recently<sub>8910</sub> the
            dogma<sub>8910</sub> of<sub>4456</sub> hormonal dependence for the sexual
        differentiation of the brain has been challenged.</p>
</doc>

As you can see, the <p> node contains <sub> nodes and text() nodes. After every <sub> node there is a text node that starts with a space (e.g. in <sub>89</sub> bases there is a space before 'bases'). I need to replace those specific spaces with <s/> nodes.

So the expected output should look like this:

<doc>
    <p>Biological<sub>89</sub><s/>bases<sub>4456</sub><s/>for<sub>8910</sub><s/>sexual<sub>4456</sub>
        <s/>differences<sub>8910</sub><s/>in<sub>4456</sub><s/>the brain exist in a wide range of
        vertebrate species, including chickens<sub>8910</sub><s/>Recently<sub>8910</sub><s/>the
        dogma<sub>8910</sub><s/>of<sub>4456</sub><s/>hormonal dependence for the sexual
        differentiation of the brain has been challenged.</p>
</doc>

To do this I can use a regular expression like this:

<xsl:template match="p/text()">
    <xsl:analyze-string select="." regex="(&#x20;)">
        <xsl:matching-substring>
            <xsl:choose>
                <xsl:when test="regex-group(1)">
                    <s/>
                </xsl:when>
            </xsl:choose>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
            <xsl:value-of select="."/>
        </xsl:non-matching-substring>
    </xsl:analyze-string>
</xsl:template>

But this adds an <s/> node for every space in the text() node, and I only need to add nodes for those specific spaces.

Can anyone suggest how I can do this?

Upvotes: 0

Views: 2073

Answers (2)

Tim C

Reputation: 70618

If you only want to match text nodes that start with a space and are immediately preceded by a sub element, you can put those conditions in the template's match pattern:

<xsl:template match="p/text()[substring(., 1, 1) = ' '][preceding-sibling::node()[1][self::sub]]">

To then remove the whitespace at the start of the string, a simple replace will do:

<xsl:value-of select="replace(., '^\s+', '')" />

Try this XSLT:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="xml" indent="no" />

    <xsl:template match="p/text()[substring(., 1, 1) = ' '][preceding-sibling::node()[1][self::sub]]">
      <s />
      <xsl:value-of select="replace(., '^\s+', '')" />
    </xsl:template>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
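
One caveat, judging from the sample input as posted: the text before "differences" follows the closing sub tag with a line break rather than a literal space, so the substring(., 1, 1) = ' ' test would not match that node. A minimal sketch of a variant, assuming any leading whitespace after a sub should be treated the same way (note that this also drops the line break and indentation, which may or may not be what is wanted):

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="xml" indent="no" />

    <!-- Assumption: any leading whitespace (space, tab, line break) counts,
         not only a literal space character. -->
    <xsl:template match="p/text()[matches(., '^\s')][preceding-sibling::node()[1][self::sub]]">
      <s />
      <!-- Strip all leading whitespace, then output the rest of the text. -->
      <xsl:value-of select="replace(., '^\s+', '')" />
    </xsl:template>

    <!-- Identity template: copy everything else unchanged. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```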

Upvotes: 2

potame

Reputation: 7905

Just change the regex to ^(&#x20;) so that it matches only a space at the very beginning of the text node.

With this XSL snippet:

<xsl:analyze-string select="." regex="^(&#x20;)">

Here is the result I obtain:

<p>Biological<sub>89</sub><s></s>bases<sub>4456</sub><s></s>for<sub>8910</sub><s></s>sexual<sub>4456</sub>
         differences<sub>8910</sub><s></s>in<sub>4456</sub><s></s>the brain exist in a wide range of
         vertebrate species, including chickens<sub>8910</sub><s></s>Recently<sub>8910</sub><s></s>the
         dogma<sub>8910</sub><s></s>of<sub>4456</sub><s></s>hormonal dependence for the sexual
         differentiation of the brain has been challenged.
      </p>
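
For completeness, here is a sketch of how the anchored regex might sit in a full stylesheet. The identity template and the predicate restricting the match to text directly after a sub element are assumptions added for illustration, not part of the snippet above:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="xml" indent="no"/>

    <!-- Identity template: copy everything not handled below. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Assumption: only text nodes directly after a sub element are processed. -->
    <xsl:template match="p/text()[preceding-sibling::node()[1][self::sub]]">
        <!-- ^ anchors the match to the start of the text node,
             so at most one leading space is replaced. -->
        <xsl:analyze-string select="." regex="^&#x20;">
            <xsl:matching-substring>
                <s/>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>
</xsl:stylesheet>
```

Without the predicate, the template would also replace a leading space in any other text node inside p, for example one that follows a different inline element.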

Upvotes: 1
