Lemon Tree

Reputation: 63

XPath: How do I capture the previous element?

I have the following structure:

<p>File name</p>
<a href="https://somelink.pdf">Download</a>

I need to capture the link (a) and its name (p) using CSS and XPath. First, I use a CSS selector to find all links whose href value ends in .pdf (a[href$=".pdf"]):

for i in response.css('a[href$=".pdf"]'):
    link = i.css('::attr("href")').get()
    name = i.xpath(?????????)
    print(name, link)

How do I capture the text in the p element using XPath?

Upvotes: 2

Views: 1545

Answers (1)

kjhughes

Reputation: 111726

Starting from a

This XPath,

//a[.="Download"]/preceding-sibling::p[1]

will select, for each a element whose string value equals "Download", the nearest preceding sibling p element.


Starting from p

This XPath,

//p[.="File name"]/following-sibling::a[1]

will select, for each p element whose string value equals "File name", the nearest following sibling a element.


In either case, you can select the text node child by appending /text() to the XPaths.

Upvotes: 3
