Reputation: 8387
hxs.select("//h:h2[re:test(., 'a', 'i')]").extract()
Undefined namespace prefix
xmlXPathEval: evaluation failed
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/scrapy/selector/libxml2sel.py", line 44, in select
raise ValueError("Invalid XPath: %s" % xpath)
ValueError: Invalid XPath: //h:h2[re:test(., 'a', 'i')]
I'm new to XPath and Scrapy.
What's wrong with it? (I'm trying to select nodes that contain the word "a").
Upvotes: 1
Views: 1439
Reputation: 18543
According to the traceback, you're using an undefined namespace prefix, re. I'm not familiar with Scrapy, but it seems you have to define the namespace prefix somewhere.
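To illustrate what "defining the prefix" means, here is a minimal sketch using Python's standard-library xml.etree.ElementTree instead of Scrapy's selector (I don't know Scrapy's registration API, so this is a stand-in, not its method). The sample markup and the XHTML namespace URI bound to h are assumptions, since the page isn't shown. ElementTree's XPath subset has no regex functions, so the case-insensitive test is done in Python.

```python
import re
import xml.etree.ElementTree as ET

# Assumed sample document; the real page is not shown in the question.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h2>Apple</h2>
    <h2>Berry</h2>
    <h2>Banana</h2>
  </body>
</html>"""

# Every prefix used in the path must be bound to a namespace URI.
# Binding "h" to the XHTML namespace is an assumption about the page.
NAMESPACES = {"h": "http://www.w3.org/1999/xhtml"}

root = ET.fromstring(DOC)
headings = root.findall(".//h:h2", NAMESPACES)

# Case-insensitive "contains the letter a" test, done outside XPath.
pattern = re.compile("a", re.IGNORECASE)
matching = [h.text for h in headings if pattern.search(h.text or "")]
print(matching)
```

Without the NAMESPACES mapping, the h: prefix cannot be resolved and the lookup fails, which is the same kind of error the traceback reports.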
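BTW, isn't the function you're trying to use called matches? Note that matches is an XPath 2.0 function, and libxml2 (which your traceback shows is in use) only implements XPath 1.0, so it may not be available to you.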
You could call it like this: //h:h2[matches(., 'a', 'i')]
An alternative would be
//h:h2[contains(lower-case(.),'a')]
Also, what you said ("I'm trying to select nodes that contain the word 'a'") contradicts the function's semantics. In your snippet, you're actually looking for a string that contains the letter a, not for a as a word.
If a is the only text in your element, you could also try using:
//h:h2[lower-case(.)='a']
Or if you're looking for a as a word in a longer text, you can combine the use of matches with XPath regular expressions.
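The difference between "contains the letter a" and "contains a as a word" can be sketched with Python's re module. This only illustrates the two semantics: the pattern syntax is Python's (the \b word boundary in particular), XPath's regex dialect may differ, and the sample strings are made up.

```python
import re

# Matches the letter "a" anywhere, ignoring case.
letter = re.compile(r"a", re.IGNORECASE)
# Matches "a" only as a standalone word, ignoring case.
word = re.compile(r"\ba\b", re.IGNORECASE)

# Hypothetical heading texts.
samples = ["Banana split", "Once upon a time", "A", "Hello"]

contains_letter = [s for s in samples if letter.search(s)]
contains_word = [s for s in samples if word.search(s)]

print(contains_letter)
print(contains_word)
```

"Banana split" passes the letter test but not the word test, which is exactly the mismatch between the question's wording and its expression.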
Upvotes: 3