user2423959

Reputation: 834

select the word preceding the number

I have the XML data below.

<para>The functions and duties of the CCS set out in s 6 CA are focused on promoting efficient market conduct and competitiveness of markets in Singapore. Consumer welfare, mentioned in the Singapore&#8211;US FTA, is not expressly mentioned as a purpose of the CA, nor is it expressly set out in the CA as an objective to be safeguarded by the CCS. However, CCS Guideline 1, at para 2.1 on &#8220;Purpose&#8221; of the CA makes reference to how consumers benefit as a consequence of competition. And this is equal to 2.1% of the entire data</para></footnote></para></item>

I want to capture a number only when it is preceded by the word para or paras, for example para 2.1 or paras 1.2 and 0.8. It should not be captured when it follows any other word, as in general 1.2 and 1.3.

The XSLT I am using is below.

    <xsl:template match="text()">
        <xsl:analyze-string select="." regex="(([Cc]hapter)\s(\d+))">
            <xsl:matching-substring>
                <xsl:choose>
                    <xsl:when test="number(regex-group(3)) &lt; number(9)">
                        <a href="{concat('er:#BGCL_CH_',format-number(number(regex-group(3)),'00'),'/','BGCL_CH_',format-number(number(regex-group(3)),'00'))}">
                            <xsl:value-of select="."/>
                        </a>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:value-of select="."/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <!-- Matches any number in x.y form, whatever word precedes it -->
                <xsl:analyze-string select="." regex="([0-9]+)\.([0-9]+)">
                    <xsl:matching-substring>
                        <xsl:choose>
                            <xsl:when test="number(regex-group(1)) &lt; number(9)">
                                <a href="{concat('er:#CLI_CH_',format-number(number(regex-group(1)),'00'),'/P',format-number(number(regex-group(1)),'0'),'-',format-number(number(regex-group(2)),'000'))}">
                                    <xsl:value-of select="."/>
                                </a>
                            </xsl:when>
                            <xsl:otherwise>
                                <xsl:value-of select="."/>
                            </xsl:otherwise>
                        </xsl:choose>
                    </xsl:matching-substring>
                    <xsl:non-matching-substring>
                        <!-- Turn plain http:// URLs in the remaining text into links -->
                        <xsl:analyze-string select="." regex="http://[^ ]+">
                            <xsl:matching-substring>
                                <a href="{.}">
                                    <xsl:value-of select="."/>
                                </a>
                            </xsl:matching-substring>
                            <xsl:non-matching-substring>
                                <xsl:value-of select="."/>
                            </xsl:non-matching-substring>
                        </xsl:analyze-string>
                    </xsl:non-matching-substring>
                </xsl:analyze-string>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

However, this captures any number in x.y format. I want to capture a number only when the text is in one of the following formats:

para x.y
paras x.y and x.y

I also want to convert x.y into the following format:

er:#BGCL_CH_x/Px-y

Please let me know how I can do this.

Thanks

Upvotes: 0

Views: 50

Answers (1)

Joel M. Lamsen

Reputation: 7173

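Try something like the following. It matches the paras x.y and x.y form first, then falls back to para x.y on the remaining text: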

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

    <xsl:template match="text()">
        <xsl:analyze-string select="." regex="paras\s([0-9]+)\.([0-9]+)\sand\s([0-9]+)\.([0-9]+)">
            <xsl:matching-substring>
                <xsl:choose>
                    <xsl:when test="number(regex-group(1)) &lt; number(9)">
                        <a
                            href="{concat('er:#CLI_CH_',format-number(number(regex-group(1)),'00'),'/P',format-number(number(regex-group(1)),'0'),'-',format-number(number(regex-group(2)),'000'))}">
                            <xsl:value-of select="substring-before(., ' and')"/>
                        </a>
                        <xsl:text> and </xsl:text>
                        <a
                            href="{concat('er:#CLI_CH_',format-number(number(regex-group(3)),'00'),'/P',format-number(number(regex-group(3)),'0'),'-',format-number(number(regex-group(4)),'000'))}">
                            <xsl:value-of select="substring-after(., 'and ')"/>
                        </a>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:value-of select="."/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:analyze-string select="." regex="para\s([0-9]+)\.([0-9]+)">
                    <xsl:matching-substring>
                        <xsl:choose>
                            <xsl:when test="number(regex-group(1)) &lt; number(9)">
                                <a
                                    href="{concat('er:#CLI_CH_',format-number(number(regex-group(1)),'00'),'/P',format-number(number(regex-group(1)),'0'),'-',format-number(number(regex-group(2)),'000'))}">
                                    <xsl:value-of select="."></xsl:value-of>
                                </a>
                            </xsl:when>
                            <xsl:otherwise>
                                <xsl:value-of select="."></xsl:value-of>
                            </xsl:otherwise>
                        </xsl:choose>
                    </xsl:matching-substring>
                    <xsl:non-matching-substring>
                        <xsl:value-of select="."/>
                    </xsl:non-matching-substring>
                </xsl:analyze-string>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

</xsl:stylesheet>
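
If the repeated concat(...) expressions become hard to maintain, one possible variation is to move the link construction into a stylesheet function and handle both forms with a single regex whose "and x.y" part is optional. This is only a sketch under some assumptions: the function name f:para-href and the namespace urn:example:para-links are made up for illustration, the chapter-number test from the stylesheet above is left out for brevity, and, unlike the stylesheet above, this regex also links a lone paras x.y and the second number in para x.y and x.y.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:para-links"
    exclude-result-prefixes="xs f">

    <!-- Hypothetical helper: builds the link target for chapter x, paragraph y,
         using the same format-number pictures as the stylesheet above. -->
    <xsl:function name="f:para-href" as="xs:string">
        <xsl:param name="chapter" as="xs:string"/>
        <xsl:param name="para" as="xs:string"/>
        <xsl:sequence select="concat('er:#CLI_CH_', format-number(number($chapter), '00'),
                                     '/P', format-number(number($chapter), '0'),
                                     '-', format-number(number($para), '000'))"/>
    </xsl:function>

    <xsl:template match="text()">
        <!-- Groups: 1 = para|paras, 2.3 = first number,
             4 = optional " and x.y" tail, 5.6 = second number -->
        <xsl:analyze-string select="."
            regex="(paras?)\s([0-9]+)\.([0-9]+)(\sand\s([0-9]+)\.([0-9]+))?">
            <xsl:matching-substring>
                <a href="{f:para-href(regex-group(2), regex-group(3))}">
                    <xsl:value-of select="concat(regex-group(1), ' ', regex-group(2), '.', regex-group(3))"/>
                </a>
                <!-- regex-group() returns '' for a group that did not take part in the match -->
                <xsl:if test="regex-group(4) != ''">
                    <xsl:text> and </xsl:text>
                    <a href="{f:para-href(regex-group(5), regex-group(6))}">
                        <xsl:value-of select="concat(regex-group(5), '.', regex-group(6))"/>
                    </a>
                </xsl:if>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <!-- Text such as "general 1.2 and 1.3" or "2.1%" passes through unchanged -->
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

</xsl:stylesheet>
```

With the sample data from the question, "para 2.1" should come out as a link to er:#CLI_CH_02/P2-001, while "2.1%" at the end of the paragraph is left alone. Because the text is rebuilt from the captured groups, any whitespace character matched by \s is written out as a single space.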

Upvotes: 1
