Reputation: 62
I'm trying to write a function that extracts the domain name (e.g. www.example.com) from a URL stored as text in an XML file.
<xsl:function name="fdd:get-domain">
  <xsl:param name="url"/>
  <xsl:analyze-string select="$url" regex="^(.*)://([a-zA-Z0-9\-\.]?[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(/\S*)?)(/.*)$">
    <xsl:matching-substring>
      <xsl:value-of select="regex-group(1)"/>
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <xsl:value-of select="false()"/>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:function>
This function always returns false. I'm not sure what I'm missing here.
Upvotes: 1
Views: 1089
Reputation: 243579
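The regex attribute of xsl:analyze-string is an attribute value template (AVT), so every literal { and } in it must be doubled to distinguish them from the single braces that delimit an AVT expression. As written, {2,3} is evaluated as the XPath expression 2,3 and replaced by the string "2 3", which is why the function returned false. Doubling the curly braces fixes this: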
^(.*)://([a-zA-Z0-9\-\.]?[a-zA-Z0-9\-\.]+\.[a-zA-Z]{{2,3}}(/\S*)?)(/.*)$
With this correction, when called like this:
fdd:get-domain('http://www.abc/cpm/page.aspx')
the result is:
http
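To see the brace handling in isolation, here is a minimal sketch. The demo element and its attribute names are made up for illustration and are not part of the original code; the stylesheet can be applied to any XML document.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <!-- Attributes of literal result elements are AVTs.
         Single braces: 2,3 is evaluated as an XPath sequence and the
         items are joined with a space.
         Doubled braces: copied through as literal { and }. -->
    <demo evaluated="{2,3}" literal="{{2,3}}"/>
  </xsl:template>
</xsl:stylesheet>
```

With an XSLT 2.0 processor this should produce `<demo evaluated="2 3" literal="{2,3}"/>`, which mirrors what happened to the quantifier in the original regex attribute.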
I guess that what you really want is the domain rather than the scheme. This modified code changes both the regular expression (the domain group now closes before the optional path) and the regex-group index:
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fdd="some:fdd">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:sequence select="fdd:get-domain('http://www.abc.com/cpm/page.aspx')"/>
  </xsl:template>

  <xsl:function name="fdd:get-domain">
    <xsl:param name="url"/>
    <xsl:analyze-string select="$url" regex=
      "^(.*)://([a-zA-Z0-9\-\.]?[a-zA-Z0-9\-\.]+\.[a-zA-Z]{{2,3}})(/\S*)?(/.*)$">
      <xsl:matching-substring>
        <xsl:value-of select="regex-group(2)"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="false()"/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:function>
</xsl:stylesheet>
When this transformation is applied to any XML document (the source document is not used), the wanted, correct result is produced:
www.abc.com
Update: As Michael Kay pointed out, the need to double the curly braces can be avoided if the RegEx is specified as the content of a variable and this variable is referenced in an AVT in the regex attribute of xsl:analyze-string:
<xsl:analyze-string select="$url" regex="{$vRegEx}" flags="mx">
This has another benefit: because the x flag tells the processor to ignore whitespace in the regex, we can split RegEx subexpressions across different lines and even intermix them with comments.
Here is the refactored transformation:
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fdd="some:fdd">
  <xsl:output method="text"/>

  <xsl:variable name="vRegEx">
    ^(.*) <!-- The scheme -->
    ://
    ([a-zA-Z0-9\-\.]?[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}) <!-- The domain -->
    (/\S*)?(/.*)$ <!-- the path and query string -->
  </xsl:variable>

  <xsl:template match="/">
    <xsl:sequence select="fdd:get-domain('http://www.abc.com/cpm/page.aspx')"/>
  </xsl:template>

  <xsl:function name="fdd:get-domain">
    <xsl:param name="url"/>
    <xsl:analyze-string select="$url" regex="{$vRegEx}"
        flags="mx">
      <xsl:matching-substring>
        <xsl:value-of select="regex-group(2)"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="false()"/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:function>
</xsl:stylesheet>
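If the strict check on the top-level domain is not needed, a shorter variant with the XPath 2.0 functions matches() and replace() may be enough. This is only a sketch under assumptions that are not in the question: the function name fdd:get-host, the input element name link, and the choice to return an empty sequence instead of the string false are all illustrative. Because select is not an AVT, the regex here needs no brace doubling.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:fdd="some:fdd">
  <xsl:output method="text"/>

  <!-- Hypothetical input shape:
       <links><link>http://www.abc.com/cpm/page.aspx</link></links> -->
  <xsl:template match="/">
    <xsl:for-each select="//link">
      <!-- Fall back to a marker string when no host could be extracted -->
      <xsl:value-of select="(fdd:get-host(normalize-space(.)), 'no match')[1]"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Returns everything between "://" and the next "/",
       or the empty sequence if the string has no scheme and host -->
  <xsl:function name="fdd:get-host" as="xs:string?">
    <xsl:param name="url" as="xs:string"/>
    <xsl:sequence select="
      if (matches($url, '^[^:/]+://[^/]+'))
      then replace($url, '^[^:/]+://([^/]+).*$', '$1')
      else ()"/>
  </xsl:function>
</xsl:stylesheet>
```

Note that this looser pattern keeps anything that precedes the first slash, so a port number or user info would be included in the result, and unlike the regex above it also accepts URLs that have no path.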
Upvotes: 1