Reputation: 63
I have the following markup:
<p>File name</p>
<a href="https://somelink.pdf">Download</a>
I need to capture the link from the a element and its name from the p element using CSS and XPath. Here is what I'm trying: first, I use a CSS selector (a[href$=".pdf"]) to find all links whose href values end in .pdf:
for i in response.css('a[href$=".pdf"]'):
    link = i.css('::attr("href")').get()
    name = i.xpath(?????????)
    print(name, link)
How do I capture the text in the p element using XPath?
Upvotes: 2
Views: 1545
Reputation: 111726
a
This XPath,
//a[.="Download"]/preceding-sibling::p[1]
will select the first p sibling element preceding each a element whose string value equals "Download".
p
This XPath,
//p[.="File name"]/following-sibling::a[1]
will select the first a sibling element following each p element whose string value equals "File name".
In either case, you can select the text node child by appending /text() to the XPath.
Upvotes: 3