Ben Usman

Reputation: 8387

Scrapy: Invalid XPath

hxs.select("//h:h2[re:test(., 'a', 'i')]").extract()


Undefined namespace prefix
xmlXPathEval: evaluation failed
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/scrapy/selector/libxml2sel.py", line 44, in select
    raise ValueError("Invalid XPath: %s" % xpath)
ValueError: Invalid XPath: //h:h2[re:test(., 'a', 'i')]

I'm new to XPath and Scrapy.

What's wrong with it? (I'm trying to select nodes that contain the word "a").

Upvotes: 1

Views: 1439

Answers (1)

toniedzwiedz

Reputation: 18543

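According to the error output ("Undefined namespace prefix"), your expression uses a namespace prefix that has not been bound to a URI. The likely culprit is re, and h needs a binding as well. I'm not familiar with Scrapy, but it seems you have to register each prefix with its namespace URI before evaluating the expression. For the EXSLT regular-expression functions such as re:test, that URI is http://exslt.org/regular-expressions.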
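To illustrate what binding the prefixes looks like, here is a minimal sketch that uses lxml directly rather than Scrapy's selector API (that substitution is an assumption on my part, as is the sample document). Both h and re are mapped to URIs through the namespaces argument:

```python
from lxml import etree

# Hypothetical XHTML sample; the question does not show the real page.
DOC = b"""<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h2>Apples</h2>
    <h2>Kiwi</h2>
    <h2>banana</h2>
  </body>
</html>"""

# Every prefix used in the expression must be bound to a namespace URI.
NAMESPACES = {
    "h": "http://www.w3.org/1999/xhtml",
    "re": "http://exslt.org/regular-expressions",
}

root = etree.fromstring(DOC)

# Same expression as in the question, now with both prefixes defined.
headings = root.xpath(
    "//h:h2[re:test(., 'a', 'i')]/text()",
    namespaces=NAMESPACES,
)
print(headings)
```

With the prefixes bound, the expression evaluates instead of raising an error, and it selects the headings whose text contains the letter a in either case.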

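BTW, isn't the function you're trying to use called matches?

You could call it like this: //h:h2[matches(., 'a', 'i')]

An alternative would be //h:h2[contains(lower-case(.), 'a')]

Note that matches and lower-case are XPath 2.0 functions, so both need an engine that supports XPath 2.0. libxml2, which the traceback shows Scrapy using here, implements XPath 1.0 only.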

Also, what you said ("I'm trying to select nodes that contain the word 'a'") contradicts the function's semantics. In your snippet, you're actually looking for a string that contains the letter a, not for a as a word.
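The difference between the two readings can be seen outside XPath. The sketch below uses Python's re module (not XPath regular expressions) on a made-up list of titles. The \b word-boundary assertion is Python regex syntax:

```python
import re

# Hypothetical heading texts, for illustration only.
titles = ["A short story", "Banana", "Kiwi", "What a day"]

# Contains the letter "a" anywhere, ignoring case.
contains_letter = [t for t in titles if re.search("a", t, re.IGNORECASE)]

# Contains "a" as a standalone word, ignoring case.
contains_word = [t for t in titles if re.search(r"\ba\b", t, re.IGNORECASE)]

print(contains_letter)
print(contains_word)
```

"Banana" passes the first test but not the second, which is the gap between what the question asks for and what the expression checks.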

If a is the only text in your element, you could also try using: //h:h2[lower-case(.)='a']
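If the engine only supports XPath 1.0, lower-case is not available. A common XPath 1.0 substitute is translate(), which maps uppercase letters to lowercase ones. The sketch below assumes lxml, a made-up document, and ASCII-only text. The names UPPER, LOWER, and lowered are mine:

```python
from lxml import etree

# Hypothetical XHTML sample, for illustration only.
DOC = b"""<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h2>A</h2>
    <h2>Apples</h2>
    <h2>Kiwi</h2>
  </body>
</html>"""

NS = {"h": "http://www.w3.org/1999/xhtml"}
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER = "abcdefghijklmnopqrstuvwxyz"

root = etree.fromstring(DOC)

# XPath 1.0 stand-in for lower-case(.): translate each uppercase letter.
lowered = "translate(., '%s', '%s')" % (UPPER, LOWER)

# Headings containing the letter "a" in either case.
containing = root.xpath(
    "//h:h2[contains(%s, 'a')]/text()" % lowered, namespaces=NS
)

# Headings whose entire text is "a" in either case.
exact = root.xpath("//h:h2[%s = 'a']/text()" % lowered, namespaces=NS)

print(containing)
print(exact)
```

translate() only handles the characters listed in its arguments, so text outside ASCII would need a longer mapping.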

Or, if you're looking for a as a word within a longer text, you can use matches with a regular expression that requires a non-word character or a string boundary on each side of a, for example (^|\W)a(\W|$).

Upvotes: 3
