Reputation: 6908
Below is the extracted div from which I need to get the output shown at the end.
I tried the usual extraction, but it didn't work.
<div class="container-inhalt">
<div class="container-hauptinfo s16">
<a title="Ki-dong Kim" id="0" href="/ki-do190">Ki-Kim</a> </div>
<div class="container-zusatzinfo-small">
<b>Age:</b> 48 Years
<img src="https://tny/87.png?lm=1520611569" title="Korea, South" alt="Ka, Sh" class="flaggenrahmen" /> <br />
<b>Appointed:</b> Apr 23, 2019 <br />
<b>Contract expires:</b> - <br />
<b>Success rate as coach:</b> 1,63 PPM </div>
<div class="container-zusatzinfo">
</div>
</div>
Desired output: 1,63 PPM
Upvotes: 0
Views: 44
Reputation: 33158
If you plan to keep working with web scraping, learning XPath and the XPath functions is a solid investment, because it is almost always possible to describe how to target a specific node. Scrapy additionally lets you run a regex on the selection for that "last mile" part:
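def parse(self, response):
    # Find the <b> label, take the sibling nodes that follow it,
    # then split their text into whitespace-separated tokens.
    tokens = response.xpath(
        '//b[contains(text(), "Success rate as coach:")]'
        '/following-sibling::node()'
    ).re(r'\s*(\S+)\s*')
    # tokens == ['1,63', 'PPM']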
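To see why the value has to be reached through a sibling axis, it may help to look at the same fragment outside Scrapy. The text "1,63 PPM" is not inside the `<b>` element; it is a text node that comes after it. The sketch below is not the Scrapy approach: it swaps in Python's standard-library `xml.etree.ElementTree`, which exposes that trailing text as `.tail`. It assumes the fragment is well-formed XML (the trimmed copy here is), and the helper name `value_after_label` is made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the fragment from the question (the <img> tag is left out).
# Assumption: the markup is well-formed XML; real pages often are not.
FRAGMENT = """
<div class="container-zusatzinfo-small">
<b>Age:</b> 48 Years
<br />
<b>Appointed:</b> Apr 23, 2019 <br />
<b>Contract expires:</b> - <br />
<b>Success rate as coach:</b> 1,63 PPM </div>
"""


def value_after_label(fragment: str, label: str) -> list[str]:
    """Return the whitespace-separated tokens that follow the <b> holding `label`."""
    root = ET.fromstring(fragment)
    for bold in root.iter("b"):
        if bold.text and label in bold.text:
            # .tail is the text between </b> and the next tag (or the parent's end),
            # i.e. the same text node that following-sibling::node() reaches in XPath.
            return re.findall(r"\S+", bold.tail or "")
    return []


tokens = value_after_label(FRAGMENT, "Success rate as coach:")
print(" ".join(tokens))  # prints: 1,63 PPM
```

For pages that are not well-formed, the XPath-based selector above is the more robust choice, since Scrapy parses the response with a lenient HTML parser.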
Upvotes: 2