Reputation: 1684
This is a piece of HTML from which I'd like to extract information:
<li>
<p><strong class="more-details-section-header">Provenance</strong></p>
<p>Galerie Max Hetzler, Berlin<br>Acquired from the above by the present owner</p>
</li>
I'd like an XPath expression that extracts the content of the second <p> ... </p>,
but only if it is preceded by a sibling <p> ... Provenance ... </p>.
This is how far I've got:
if "Provenance" in response.xpath('//strong[@class="more-details-section-header"]/text()').extract():
    print("provenance = yes")
But how do I get to the text Galerie Max Hetzler, Berlin<br>Acquired from the above by the present owner?
I tried:
if "Provenance" in response.xpath('//strong[@class="more-details-section-header"]/text()').extract():
    print("provenance = yes ", response.xpath('//strong[@class="more-details-section-header"]/following-sibling::p').extract())
But I'm getting []
Upvotes: 0
Views: 58
Reputation: 146510
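Your following-sibling::p step returns nothing because <strong> is nested inside the first <p>, so it has no <p> siblings of its own. Select the <p> whose immediately preceding <p> sibling contains the header instead. You should use: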
//p[preceding-sibling::p[1]/strong='Provenance']/text()
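To try this outside a running spider, here is a minimal sketch that evaluates both expressions against the snippet from the question. It assumes lxml is installed and wraps the <li> in a <ul> so the fragment parses as one element; the wrapper and the variable names are illustrative only. The same XPath string should work unchanged when passed to response.xpath(...) in Scrapy.

```python
from lxml import html

# The snippet from the question, wrapped in a <ul> (an assumption made
# here so the fragment parses as a single element).
SNIPPET = """
<ul>
<li>
<p><strong class="more-details-section-header">Provenance</strong></p>
<p>Galerie Max Hetzler, Berlin<br>Acquired from the above by the present owner</p>
</li>
</ul>
"""

tree = html.fromstring(SNIPPET)

# The original attempt: <strong> sits inside the first <p>, so it has
# no <p> siblings and the result is an empty list.
attempt = tree.xpath(
    '//strong[@class="more-details-section-header"]/following-sibling::p'
)

# The suggested expression: pick the <p> whose nearest preceding <p>
# sibling has a <strong> child with the string value 'Provenance'.
provenance = tree.xpath(
    "//p[preceding-sibling::p[1]/strong='Provenance']/text()"
)

print(attempt)      # []
print(provenance)   # ['Galerie Max Hetzler, Berlin', 'Acquired from the above by the present owner']
```

Because of the <br>, text() yields two separate text nodes; joining them, for example with "\n".join(provenance), gives the provenance as a single string.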
Upvotes: 1