Reputation: 41
How can I use the XPath matches function to search for whole words in an XML tag?
The following code returns "unknown method matches":
XML_Doc:=CreateOleObject('Msxml2.DOMDocument.6.0') as IXMLDOMDocument3;
XML_DOC.selectNodes('/DATI/DATO[matches(TEST_TAG,"\bTest\b")]');
Example XML file:
<DATI>
<DATO>
<TEST_TAG>Test</TEST_TAG>
</DATO>
<DATO>
<TEST_TAG>Test21</TEST_TAG>
</DATO>
<DATO>
<TEST_TAG>Abc</TEST_TAG>
</DATO>
</DATI>
Upvotes: 3
Views: 641
Reputation: 243449
Suppose that by "word" you mean a string that starts with a Latin letter and in which every character is either a Latin letter or a decimal digit.
Then one can use this XPath 1.0 expression to select exactly these:
//TEST_TAG
[contains('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ',
substring(.,1,1)
)
and
not(
translate(.,
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789',
'')
)
]
XSLT-based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/*">
<xsl:copy-of select=
"//TEST_TAG
[contains('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ',
substring(.,1,1)
)
and
not(
translate(.,
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789',
'')
)
]
"/>
</xsl:template>
</xsl:stylesheet>
When applied to this XML document (the provided one, but with an illegal "word" added):
<DATI>
<DATO>
<TEST_TAG>Test</TEST_TAG>
</DATO>
<DATO>
<TEST_TAG>#$%Test21</TEST_TAG>
</DATO>
<DATO>
<TEST_TAG>Abc</TEST_TAG>
</DATO>
</DATI>
the transformation evaluates the above XPath expression and copies the selected elements to the output:
<TEST_TAG>Test</TEST_TAG>
<TEST_TAG>Abc</TEST_TAG>
Do note:
The currently-accepted answer incorrectly produces this:
<TEST_TAG>#$%Test21</TEST_TAG>
as an element whose string value is a "word".
Upvotes: 0
Reputation: 16917
matches is an XPath 2 function, and MSXML only supports XPath 1.
As far as I know, there is no library supporting XPath 2 for Delphi (although I wrote an XPath 2 library for Free Pascal, which should not be too difficult to port).
You could use
/DATI/DATO[not(contains(TEST_TAG," "))]
to select the DATO elements whose TEST_TAG contains no space; this uses only XPath 1.
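If the goal is a whole-word search for one specific word, as in the question's \bTest\b, a common XPath 1 idiom is to pad both the value and the search word with spaces and then use contains. The sketch below applies that idiom to the sample XML from the question. It assumes that words are separated by whitespace only (punctuation is not treated as a boundary) and that the units ActiveX, ComObj and MSXML are available under those names; newer Delphi versions use Winapi.ActiveX, System.Win.ComObj and Winapi.msxml instead. The constant names are made up for illustration.

```delphi
program WholeWordXPath;
{$APPTYPE CONSOLE}

uses
  SysUtils, ActiveX, ComObj, MSXML; // unit names are an assumption, see above

const
  // Sample document from the question.
  SampleXml =
    '<DATI>' +
    '<DATO><TEST_TAG>Test</TEST_TAG></DATO>' +
    '<DATO><TEST_TAG>Test21</TEST_TAG></DATO>' +
    '<DATO><TEST_TAG>Abc</TEST_TAG></DATO>' +
    '</DATI>';

  // XPath 1 only: normalize whitespace, pad the value with spaces,
  // then look for the word surrounded by spaces.
  WholeWordQuery =
    '/DATI/DATO[contains(concat(" ", normalize-space(TEST_TAG), " "), " Test ")]';

var
  XML_Doc: IXMLDOMDocument3;
  Nodes: IXMLDOMNodeList;
  I: Integer;
begin
  CoInitialize(nil);
  try
    XML_Doc := CreateOleObject('Msxml2.DOMDocument.6.0') as IXMLDOMDocument3;
    XML_Doc.async := False;
    if not XML_Doc.loadXML(SampleXml) then
      raise Exception.Create(XML_Doc.parseError.reason);

    Nodes := XML_Doc.selectNodes(WholeWordQuery);
    // Print the text of every DATO whose TEST_TAG contains "Test" as a whole word.
    for I := 0 to Nodes.length - 1 do
      Writeln(Nodes.item[I].text);
  finally
    // Release the COM interfaces before shutting COM down.
    Nodes := nil;
    XML_Doc := nil;
    CoUninitialize;
  end;
end.
```

With the sample data this should select only the first DATO, because " Test21 " does not contain " Test ". If punctuation should also count as a word boundary, one option is to wrap TEST_TAG in translate(...) to turn those characters into spaces before padding.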
Upvotes: 4