Reputation:
I got this HTML from the NTP website:
<h5>Soundbooster</h5> <br><br>
<p class="details">
<b>Filtro attuale</b>
</p>
<blockquote>
<p>
<b>Catalogo:</b>
Aliant</br>
<b>Marca e Modello:</b>
Mazda - 3 </br>
<b>Versione:</b>
(3th gen) 2013-now (Petrol)
</p>
</blockquote>
I am trying to extract the text "Mazda - 3", but I am unable to get it; it returns blank. In my code, "Mazda - 3" should end up in the brand value. The name and version values are extracted correctly.
This is how I implemented it:
for ntp in response.css('div.content-1col-nobox'):
    name = ntp.xpath('normalize-space(//h5/text())').extract_first()
    brand = ntp.xpath('normalize-space(//blockquote/p//text()[4])').extract_first()
    version = ntp.xpath('normalize-space(//div/blockquote[1]/p//text()[6])').extract_first()
    result = ("{} {} - {}".format(name, brand, version))
This post is related to an earlier one. The approach worked there, but I realized that I only get part of the data. See: Scrapy add.xpath or join xpath
Can anybody help me, please?
Thank you in advance.
Upvotes: 0
Views: 51
Reputation: 10666
I'm not sure what ntp is in your code, but this should work:
brand = ntp.xpath('.//b[.="Marca e Modello:"]/following-sibling::text()[1]').extract_first()
Upvotes: 1