XML newbie

Reputation: 1

XML XPath search with Python lxml fails to filter on text() output

Probably missing something obvious - when I filter for general "a" nodes, I get to see their text - including the target link I want - just fine:

ipdb> print [x.text for x in root.xpath(u".//a")]
[u'\u0391\u03c0\u03bf\u03c3\u03cd\u03bd\u03b4\u03b5\u03c3\u03b7', None, ... ]

But when I filter for the specific text contained in the first 'a' element returned above, I get nothing:

ipdb> print [x.text for x in root.xpath(
    u".//a[text()=" + 
    u'\u0391\u03c0\u03bf\u03c3\u03cd\u03bd\u03b4\u03b5\u03c3\u03b7' + 
    u']'  )]
[]
ipdb> 

Any ideas?

Upvotes: 0

Views: 532

Answers (1)

Simon Sapin

Reputation: 10180

There are two languages here: Python and XPath. Each of them has quoted strings.

Python evaluates the concatenation first, so the string actually passed to .xpath() (your XPath expression) looks something like this: .//a[text()=Some text]. String literals must be quoted in XPath, though: .//a[text()="Some text"]. For a single unquoted word such as yours, XPath reads it as a child element name rather than as a string, which is why nothing matches. You then need to write the quoted expression as a Python string. Here you have a few alternatives:

.xpath('.//a[text()="Some text"]')
.xpath(".//a[text()=\"Some text\"]")
.xpath(""".//a[text()="Some text"]""")
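When the text comes from a variable, as in the question, the same quoting has to end up inside the expression. Below is a minimal sketch in Python 3, assuming lxml is installed; the sample document and the variable name `t` are made up for illustration. It shows two ways to do this: formatting the quotes into the expression, or passing the value as an XPath variable through a keyword argument to .xpath().

```python
from lxml import etree

# Hypothetical sample document standing in for the asker's page.
root = etree.fromstring(
    "<div><a>Αποσύνδεση</a><a><img/></a><a>Other</a></div>"
)
target = "Αποσύνδεση"

# Option 1: put XPath quotes around the text when building the expression.
quoted_expr = './/a[text()="{}"]'.format(target)
matches_quoted = [a.text for a in root.xpath(quoted_expr)]

# Option 2: reference an XPath variable ($t) and pass its value as a
# keyword argument, so no quoting is needed inside the expression.
matches_var = [a.text for a in root.xpath(".//a[text()=$t]", t=target)]

print(quoted_expr)
print(matches_quoted == matches_var)
```

The variable form is probably the safer of the two when the text may itself contain quote characters, since the value never becomes part of the expression string.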

Upvotes: 1
