Reputation:
basically i want to select a node (div) in which it's children node's(h1,b,h3) contain specified text.
<html>
<div id="contents">
<p>
<h1> Child text 1</h1>
<b> Child text 2 </b>
...
</p>
<h3> Child text 3 </h3>
</div>
i am expecting, /html/div/ not /html/div/h1
i have this below, but unfortunately returns the children, instead of the xpath to the div.
expression = "//div[contains(text(), 'Child text 1')]"
doc.xpath(expression)
i am expecting, /html/div/ not /html/div/h1
So is there a way to do this simply with xpath syntax?
Upvotes: 8
Views: 15526
Reputation: 2705
The following expression gives a node (div) in which any children nodes (not just h1,b,h3) contain specified text (not the div itself):
doc.xpath('//div[.//*[contains(text(), "Child text 1")]]')
you can refine that and return the only the div with the id contents
like in your example:
doc.xpath('//div[@id="contents" and .//*[contains(text(), "Child text 1")]]')
It does not match, if the text is a text node of the div (directly inside the div), which is my interpretation of the question.
Upvotes: 17
Reputation: 186562
You could append "/.." to anchor back to the parent. Not sure if there's a more robust method.
expression = "//div[contains(text(), 'Child text 1')]/.."
Upvotes: 10