Reputation: 131
I have two types of XML file (pom.xml and descriptors) that I want to join into a single dataset. There is no common key, so I'm taking the two directories and using the project name fragment before the underscore as the key.
I have two variables to work with:
repository="/home/qeebrato/Git/ddt"
uri="file:/home/qeebrato/Git/ddt/eventhandlers_repeatlookup/src/main/resources/descriptors/eventhandlers_repeatlookup.descriptor"
I want "eventhandlers".
To get this project fragment I have
<xsl:attribute name="project"><xsl:value-of select='replace(@uri,"(.*)@repository(^_).*_(^$)","$2")'/></xsl:attribute>
The webpages on XSLT string processing I've seen make no mention of using identifiers inside the regex.
Upvotes: 0
Views: 638
Reputation: 1190
The replace() function takes at least three arguments: the input string, the regex pattern to match, and the replacement.
In your sample:
* The input string is the uri attribute on some element.
* The pattern seems to include the value of the repository attribute on this same element.
* The replacement is just the second captured group in the pattern.
The main problem you mention in your post is in the pattern -- you want to include the value of the repository attribute. To do so, we can follow Martin Honnen's advice from his comment, and use concat() to construct the string:
concat("(.*)", @repository, "(^_).*_(^$)")
I created a simple test XML document:
<?xml version="1.0" encoding="UTF-8"?>
<test repository="/home/qeebrato/Git/ddt" uri="file:/home/qeebrato/Git/ddt/eventhandlers_repeatlookup/src/main/resources/descriptors/eventhandlers_repeatlookup.descriptor"/>
And a simple XSL file to apply to this test, using the fixed replace() call above:
<xsl:template match="test">
<xsl:value-of select='replace(@uri,concat("(.*)", @repository, "(^_).*_(^$)"),"$2")'/>
</xsl:template>
Running this XSL against this XML gives me:
file:/home/qeebrato/Git/ddt/eventhandlers_repeatlookup/src/main/resources/descriptors/eventhandlers_repeatlookup.descriptor
... which is identical to the original value of the uri attribute. Ultimately, your replace() doesn't do anything.
From the W3C specification:
Summary: The function returns the xs:string that is obtained by replacing each non-overlapping substring of $input that matches the given $pattern with an occurrence of the $replacement string.
A careful reading of this, together with testing, shows that the function returns $input unchanged if $pattern is valid but doesn't match anything.
Let's deconstruct your $pattern regex.
* (.*) -- zero or more characters.
* @repository -- the value of the repository attribute: /home/qeebrato/Git/ddt. This does appear in your $input string.
* (^_) -- this is where things go funny. You probably meant [^_] instead, with square brackets, which indicates a character that is not an underscore. (^_) with round parentheses translates to a capturing match of an underscore at the start of $input, or at the start of a line, depending on your mode. The replace() function defaults to ^ matching the start of the whole string. Since the ^ comes after the repository path in your pattern, it can never sit at the start of your $input string, so this $pattern fails to match and the function just returns $input as-is.
You say, I want "eventhandlers". If you mean that you want to extract this portion of the string, here's the replace() call you'd need to get that as output:
replace(@uri, concat(".*", @repository, "/([^_]+)_.*$"), "$1")
Breaking this down:
* .* matches zero or more characters.
* @repository plugs in the string value of that attribute: /home/qeebrato/Git/ddt
* / since we need another path separator.
* ([^_]+) is in round parens to capture, and what we capture is + one or more characters that [^_] are not an underscore.
* _.*$ matches the following underscore, and then anything else until the end of the string.
We replace all that with $1, our first (and only) captured match, producing eventhandlers.
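For reference, here is a minimal sketch of a complete stylesheet around that call, run against the test document from earlier. It assumes an XSLT 2.0 processor, since replace() is not available in XSLT 1.0, and the text output method is just my choice for testing. Note that @repository is spliced into the pattern as regex text; that is harmless here only because this particular path contains no regex metacharacters.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Plain-text output so only the extracted fragment is printed -->
  <xsl:output method="text"/>

  <xsl:template match="test">
    <!-- Capture everything between the repository path plus "/" and the first underscore -->
    <xsl:value-of
        select='replace(@uri, concat(".*", @repository, "/([^_]+)_.*$"), "$1")'/>
  </xsl:template>
</xsl:stylesheet>
```

If a repository path could ever contain characters such as . or +, they would need to be escaped before being concatenated into the pattern.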
You mention in your post that you have two variables. However, you use the @ symbol in your replace() call, which specifies an attribute value.
If repository and uri are actually variables (defined in your XSL using <xsl:variable> elements) or parameters (defined using <xsl:param>), then you need to use $ instead of @.
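If they really are variables or parameters, the call might look like the sketch below. The names repository and uri come from your post; the descriptor wrapper element is a hypothetical name I made up so the project attribute has somewhere to live, the default values are copied from your question, and XSLT 2.0 is assumed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Normally supplied by the caller; defaults copied from the question -->
  <xsl:param name="repository" select="'/home/qeebrato/Git/ddt'"/>
  <xsl:param name="uri" select="'file:/home/qeebrato/Git/ddt/eventhandlers_repeatlookup/src/main/resources/descriptors/eventhandlers_repeatlookup.descriptor'"/>

  <xsl:template match="/">
    <!-- "descriptor" is a hypothetical element name used only for illustration -->
    <descriptor>
      <xsl:attribute name="project">
        <!-- $name reads a variable or parameter; @name would read an attribute -->
        <xsl:value-of
            select='replace($uri, concat(".*", $repository, "/([^_]+)_.*$"), "$1")'/>
      </xsl:attribute>
    </descriptor>
  </xsl:template>
</xsl:stylesheet>
```

The only change from the attribute version is the sigil: the pattern and the $1 replacement stay the same.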
If you're working with regular expressions a lot, it will likely prove very worthwhile to use a regular expression tool, such as Regex Tester (online), RegExr (online), or RegexBuddy (a paid application; apparently made by the same guy that maintains http://www.regular-expressions.info/).
(Full disclosure: I have used RegexBuddy for years, but otherwise have no relationship with any of these regex websites or tool developers).
Upvotes: 1