
Reputation: 217

XSLT search and replace punctuation mark

I have an XSLT cascade that converts XML to TeX. In the last step I have a simple XML file with all the text between two tags, and I want to apply several search-and-replace routines to it.

So an input file like this:


when transformed with this XSLT (taken more or less verbatim from "Replacing strings in various XML files")

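<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>
    <xsl:param name="list">
        <words>
            <word><search> / </search><!-- ... --></word>
        </words>
    </xsl:param>
    <xsl:template match="@*|*|comment()|processing-instruction()">
        <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
    </xsl:template>
    <xsl:template match="text()">
        <xsl:variable name="search" select="concat('(',string-join($list/words/word/search,'|'),')')"/>
        <xsl:analyze-string select="." regex="{$search}">
            <xsl:matching-substring>
                <xsl:value-of select="$list/words/word[search=current()]/replace"/>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>
</xsl:stylesheet>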

should produce the following output:




Unfortunately "{–" seems to trigger something and disappears. Can anyone explain why?

Upvotes: 1

Views: 798

Answers (1)

Daniel Haley

Reputation: 52888

Glad the original answer you linked to helped. Please consider upvoting if you haven't already. ;-)

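The problem is that . is special in regex: it matches any single character. So <search>.–</search> matches any character followed by –, which includes {–. The matched text {– then does not equal any active search value in your list, so word[search=current()] selects nothing, the replacement is empty, and {– disappears from the output.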
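To see the wildcard behaviour in isolation, here is a minimal sketch that only calls the standard XPath 2.0 matches() function. The named template "main" is a hypothetical entry point added for illustration (with Saxon it could be invoked via the -it:main option); it is not part of the stylesheet from the question.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical entry point named "main"; not part of the question's stylesheet. -->
  <xsl:template name="main">
    <!-- Unescaped: "." is a wildcard, so the pattern ".–" also matches "{–". -->
    <xsl:value-of select="matches('{–', '.–')"/>   <!-- true -->
    <xsl:text>&#10;</xsl:text>
    <!-- Escaped: "\." matches only a literal full stop, so "{–" no longer matches. -->
    <xsl:value-of select="matches('{–', '\.–')"/>  <!-- false -->
    <xsl:text>&#10;</xsl:text>
    <!-- The literal ".–" still matches the escaped pattern. -->
    <xsl:value-of select="matches('.–', '\.–')"/>  <!-- true -->
  </xsl:template>
</xsl:stylesheet>
```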

You should escape the . in your search variable:

<xsl:variable name="search" select="replace(concat('(',string-join($list/words/word/search,'|'),')'),'\.','\\.')"/>

You will need to escape any other special regex characters as well, so you might consider creating an xsl:function to make that part easier.

Here's an example of a function that will escape . and { for starters...

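<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:so="stackoverflow example" exclude-result-prefixes="so">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:param name="list">
    <words>
      <word><search> / </search><!-- ... --></word>
    </words>
  </xsl:param>

  <!-- Prefixes each . or { in $regex with a backslash; {{ in the regex attribute is an escaped literal {. -->
  <xsl:function name="so:escapeRegex">
    <xsl:param name="regex"/>
    <xsl:analyze-string select="$regex" regex="\.|\{{">
      <xsl:matching-substring>
        <xsl:value-of select="concat('\',.)"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:function>

  <xsl:template match="@*|*|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="text()">
    <xsl:variable name="search" select="so:escapeRegex(concat('(',string-join($list/words/word/search,'|'),')'))"/>
    <xsl:analyze-string select="." regex="{$search}">
      <xsl:matching-substring>
        <xsl:message>"<xsl:value-of select="."/>" matched <xsl:value-of select="$search"/></xsl:message>
        <xsl:value-of select="$list/words/word[search=current()]/replace"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>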

If you uncomment the last word in your list param, it will replace the {– in your example.
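If you want to cover the remaining metacharacters as well, one possible extension is sketched below. It is an illustration under stated assumptions, not the function from the answer above: the ex:escape-regex name, the urn:example:regex namespace, the "main" entry template, the sample input string and the bracketed replacement values are all hypothetical. It escapes each search term on its own before joining, so the (, | and ) that build the alternation stay unescaped.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:ex="urn:example:regex" exclude-result-prefixes="xs ex">
  <xsl:output method="text"/>

  <!-- Hypothetical word list; the replacement values are placeholders. -->
  <xsl:param name="list">
    <words>
      <word><search>.–</search><replace>[DOT-DASH]</replace></word>
      <word><search>{–</search><replace>[BRACE-DASH]</replace></word>
    </words>
  </xsl:param>

  <!-- Hypothetical helper: puts a backslash in front of every regex
       metacharacter so the string is matched literally. -->
  <xsl:function name="ex:escape-regex" as="xs:string">
    <xsl:param name="literal" as="xs:string"/>
    <xsl:sequence select="replace($literal, '[\.\\\?\*\+\{\}\(\)\|\^\$\[\]]', '\\$0')"/>
  </xsl:function>

  <!-- Hypothetical entry point; with Saxon it could be run via -it:main. -->
  <xsl:template name="main">
    <!-- Escape each term first, then build the alternation (term1|term2). -->
    <xsl:variable name="search"
      select="concat('(', string-join(for $s in $list/words/word/search
                                      return ex:escape-regex($s), '|'), ')')"/>
    <xsl:analyze-string select="'a.– b{– c'" regex="{$search}">
      <xsl:matching-substring>
        <!-- Look up the replacement whose search value equals the matched text. -->
        <xsl:value-of select="$list/words/word[search = current()]/replace"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input above, the generated regex is (\.–|\{–) and the output should be a[DOT-DASH] b[BRACE-DASH] c. Escaping the whole concatenated string instead would also escape the grouping parentheses and the | separators once those characters are added to the escape list, which is why the sketch escapes per term.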

Upvotes: 1
