Reputation: 55
I need to find elements containing only digits and dots using XPath 1.0.
For example, given elements like these:
89.0.1/
89.0.2/
89.0/
89.0b1/
89.0b10/
89.0b11/
something-else/
only these should remain:
89.0.1/
89.0.2/
89.0/
Upvotes: 0
Views: 122
Reputation: 479
I am not sure whether your package supports EXSLT.
If it does, you can use regular expressions.
Bind the prefix re to this namespace:
re = http://exslt.org/regular-expressions
Then use the following XPath (shown here with lxml):
s="""
<root>
<item>89.0.1</item>
<item>89.0.2</item>
<item>89.0</item>
<item>89.0b1</item>
<item>89.0b10</item>
<item>89.0b11</item>
<item>something-else</item>
</root>
"""
from lxml import etree

ns = {"re": "http://exslt.org/regular-expressions"}
root = etree.XML(s)
# Raw string, so Python passes \d to the regex engine without an invalid-escape warning.
print(root.xpath(r"item/text()[re:test(., '^[\d.]+$')]", namespaces=ns))
Upvotes: 1
Reputation: 29022
You can match all elements that have only digits and dots (and the /
char, as in your sample) with this XPath-1.0 expression:
self::*[translate(., '0123456789./','')='']
If you don't need the elements, but rather just a TRUE/FALSE condition, omit the self::*[ and ] part.
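To try the translate() approach quickly, here is a minimal sketch using lxml. The <root>/<item> wrapper is only an assumed sample layout (borrowed from the other answer), and the string-length() guard is an optional addition of mine: without it, an empty element would also pass the test, because translating an empty string leaves an empty string.

```python
from lxml import etree

# Assumed sample data mirroring the question, including the trailing "/",
# plus one empty element to show why the string-length() guard can matter.
doc = etree.XML(
    "<root>"
    "<item>89.0.1/</item>"
    "<item>89.0.2/</item>"
    "<item>89.0/</item>"
    "<item>89.0b1/</item>"
    "<item>89.0b10/</item>"
    "<item>89.0b11/</item>"
    "<item>something-else/</item>"
    "<item/>"
    "</root>"
)

# translate() deletes every allowed character; an element qualifies when
# nothing is left over. The string-length() check (an optional extra)
# skips empty elements, which would otherwise qualify as well.
ONLY_DIGITS_AND_DOTS = (
    "item[string-length(.) > 0 and translate(., '0123456789./', '') = '']"
)

matches = [el.text for el in doc.xpath(ONLY_DIGITS_AND_DOTS)]
print(matches)  # ['89.0.1/', '89.0.2/', '89.0/']
```

This uses only XPath 1.0 core functions, so it should not depend on EXSLT support in your processor.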
Upvotes: 2