Reputation: 5
I am new to XSLT. Is it possible to get the position of a specific word? For example, I have data like this:
<Data>The quick brown fox jumps over the lazy dog!</Data>
I want to get the positions of "brown", "over", "dog" and "!", and store each one in a differently named output element. For example, the position of brown is <foo>3</foo>, the position of over is <boo>6</boo>, dog is <hop>9</hop>, and ! is <po_df>10</po_df>. Is it possible?
Upvotes: 0
Views: 146
Reputation: 167516
If you were only looking for words, you could use tokenize(., '\s+|\p{P}') (this needs XSLT 2.0 or later) together with index-of:
<xsl:template match="Data">
  <xsl:copy>
    <xsl:variable name="words" select="tokenize(., '\s+|\p{P}')"/>
    <xsl:for-each select="'brown', 'over', 'dog'">
      <matched item="{.}" at-pos="{index-of($words, .)}"/>
    </xsl:for-each>
  </xsl:copy>
</xsl:template>
which gives
<Data>
  <matched item="brown" at-pos="3"/>
  <matched item="over" at-pos="6"/>
  <matched item="dog" at-pos="9"/>
</Data>
so it has the right positions. I am not sure where the names of the elements you posted (like hop) are to be taken from, so I have not tried to implement that.
As you also want to identify a punctuation character, I am not sure tokenize suffices: it treats punctuation as a separator and discards it, so "!" never ends up in $words. Even with analyze-string it is not straightforward to match and collect the position. Maybe someone else has a better idea.
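That said, one direction that might work is to let xsl:analyze-string collect both words and punctuation marks as tokens and then reuse index-of on that sequence. The following is only a sketch under a few assumptions: an XSLT 2.0 processor, "position" meaning the index in the sequence of word and punctuation tokens, and the element names foo, boo, hop and po_df simply hard-coded from the question because I do not know where they should come from. Note that the regex attribute is an attribute value template, so the curly braces in \p{P} have to be doubled.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output indent="yes"/>

  <xsl:template match="Data">
    <!-- Collect every word and every punctuation character as its own token,
         in document order. Whitespace is not matched and is dropped. -->
    <xsl:variable name="tokens" as="xs:string*">
      <!-- regex is an attribute value template: {{ and }} stand for { and } -->
      <xsl:analyze-string select="." regex="\w+|\p{{P}}">
        <xsl:matching-substring>
          <xsl:sequence select="."/>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </xsl:variable>

    <xsl:copy>
      <!-- Element names are hard-coded from the question (an assumption). -->
      <foo><xsl:value-of select="index-of($tokens, 'brown')"/></foo>
      <boo><xsl:value-of select="index-of($tokens, 'over')"/></boo>
      <hop><xsl:value-of select="index-of($tokens, 'dog')"/></hop>
      <po_df><xsl:value-of select="index-of($tokens, '!')"/></po_df>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

For the sample input the tokens would be The, quick, brown, fox, jumps, over, the, lazy, dog and !, so the four elements should contain 3, 6, 9 and 10. The comparison is case-sensitive, and if a token occurs more than once, index-of returns all of its positions and xsl:value-of outputs them separated by spaces.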
Upvotes: 1