robots.txt

Reputation: 137

Trouble creating an xpath to be able to locate elements conditionally

I have been trying to write an XPath expression that locates the first three Yes values, i.e. the p elements that come before the h1 element containing the text Demarcation. The expression I've used in the script below matches every p element instead, and I can't work out how to narrow it down. Treat the one I've written as a placeholder.

How can I write an XPath expression that locates the first three Yes values within the p elements and nothing else?

My attempt so far:

from lxml.html import fromstring

htmldoc="""
<li>
    <a>Nope</a>
    <a>Nope</a>
    <p>Yes</p>
    <p>Yes</p>
    <p>Yes</p>
    <h1>Demarcation</h1>
    <p>No</p>
    <p>No</p>
    <h1>Not this</h1>
    <p>No</p>
    <p>Not this</p>
</li>
"""
root = fromstring(htmldoc)
for item in root.xpath("//li/p"):
    print(item.text)

Upvotes: 1

Views: 50

Answers (2)

eLRuLL

Reputation: 18799

It looks like you are trying to depend on the h1 tag containing Demarcation, so start from it:

//h1[contains(., "Demarcation")]/preceding-sibling::p[contains(., "Yes")][position()<4]

The idea is to take the p elements that precede that h1. I added position()<4 so that you only get three; remove it if you need all of the matching p elements:

//h1[contains(., "Demarcation")]/preceding-sibling::p[contains(., "Yes")]
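For reference, here is a minimal sketch of how the first expression could be run against the markup from the question, assuming lxml is installed. The sample closes the second h1 with </h1>, and the helper name texts and the second document with a "Yes (extra)" paragraph are hypothetical additions for illustration. Because preceding-sibling is a reverse axis, position() counts backwards from the h1, so position()<4 should keep the three p elements nearest to it.

```python
from lxml.html import fromstring

# Markup from the question, with the second h1 closed properly.
HTMLDOC = """
<li>
    <a>Nope</a>
    <a>Nope</a>
    <p>Yes</p>
    <p>Yes</p>
    <p>Yes</p>
    <h1>Demarcation</h1>
    <p>No</p>
    <p>No</p>
    <h1>Not this</h1>
    <p>No</p>
    <p>Not this</p>
</li>
"""

# Hypothetical variant (not in the question): a fourth matching paragraph
# placed furthest from the h1, to show which ones position()<4 keeps.
HTMLDOC_EXTRA = """
<li>
    <p>Yes (extra)</p>
    <p>Yes</p>
    <p>Yes</p>
    <p>Yes</p>
    <h1>Demarcation</h1>
    <p>No</p>
</li>
"""

EXPR = ('//h1[contains(., "Demarcation")]'
        '/preceding-sibling::p[contains(., "Yes")][position()<4]')


def texts(markup, expr):
    """Hypothetical helper: parse the markup and return the text of each match."""
    root = fromstring(markup)
    return [el.text for el in root.xpath(expr)]


first_three = texts(HTMLDOC, EXPR)
nearest_three = texts(HTMLDOC_EXTRA, EXPR)

print(first_three)    # ['Yes', 'Yes', 'Yes']
print(nearest_three)  # ['Yes', 'Yes', 'Yes'] ("Yes (extra)" is at position 4)
```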

Upvotes: 0

Andersson

Reputation: 52685

Try the expression below to select the paragraphs that are preceding siblings of the "Demarcation" header:

//li/p[following-sibling::h1[.="Demarcation"]]
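A quick way to check this, sketched under the assumption that lxml is installed and that the second h1 in the sample is closed with </h1>: run the expression next to the original //li/p placeholder and compare. The variable names are only illustrative. Note that the predicate only looks at the h1 (an exact match on its string value) and does not test the text of the p itself, so any p before the header would be selected, whatever it contains.

```python
from lxml.html import fromstring

# Markup from the question, with the second h1 closed properly.
htmldoc = """
<li>
    <a>Nope</a>
    <a>Nope</a>
    <p>Yes</p>
    <p>Yes</p>
    <p>Yes</p>
    <h1>Demarcation</h1>
    <p>No</p>
    <p>No</p>
    <h1>Not this</h1>
    <p>No</p>
    <p>Not this</p>
</li>
"""

root = fromstring(htmldoc)

# The placeholder from the question: every p that is a child of li.
all_paragraphs = [p.text for p in root.xpath("//li/p")]

# Only the p elements that have a following sibling h1 whose
# string value is exactly "Demarcation".
before_header = [
    p.text
    for p in root.xpath('//li/p[following-sibling::h1[.="Demarcation"]]')
]

print(all_paragraphs)  # ['Yes', 'Yes', 'Yes', 'No', 'No', 'No', 'Not this']
print(before_header)   # ['Yes', 'Yes', 'Yes']
```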

Upvotes: 2
